Abstract

We provide a quick overview of various calculus tools and of the main results concerning 
the heat flow on compact metric measure spaces, with applications to spaces with lower 
Ricci curvature bounds. 

Topics include the Hopf-Lax semigroup and the Hamilton-Jacobi equation in metric 
spaces, a new approach to differentiation and to the theory of Sobolev spaces over metric 
measure spaces, the equivalence of the $L^2$-gradient flow of a suitably defined "Dirichlet 
energy" and the Wasserstein gradient flow of the relative entropy functional, a metric 
version of Brenier's Theorem, and a new (stronger) definition of Ricci curvature bound 
from below for metric measure spaces. This new notion is stable w.r.t. measured 
Gromov-Hausdorff convergence and it is strictly connected with the linearity of the heat flow. 
1 Introduction

The aim of these notes is to provide a quick overview of the main results contained in [4] and [6] 
in the simplified case of compact metric spaces (X, d) endowed with a reference probability 
measure m. The idea is to give the interested reader the possibility to get as quickly as 
possible the key ideas behind the proofs of our recent results, neglecting all the problems 
that appear in a more general framework (as a matter of fact, no compactness assumption 
is made in [4, 6] and finiteness of m is assumed only in [6]). Passing from compact spaces 
to complete and separable ones (and even to a more general framework which includes the 
so-called Wiener space) is not just a technical problem, meaning that several concepts need to 
be properly adapted in order to achieve such generality. Hence, in particular, the discussion 
here is by no means exhaustive, as both the key statements and the auxiliary lemmas are 
stated in the simplified case of a probability measure in a compact space. 

Apart from some very basic concepts about optimal transport, Wasserstein distance and 
gradient flows, this paper aims to be self-contained. All the concepts that we need are recalled 
in the preliminary section; their proofs can be found, for instance, in the first three chapters 
of [1] (for an overview on the theory of gradient flows, see also [3], and for a much broader 
discussion on optimal transport, see the monograph by Villani [32]). For the sake of 
completeness, we included in our discussion some results coming from previous contributions which 
are potentially less known, in particular: the (sketch of the) proof by Lisini [22] of the 
characterization of absolutely continuous curves w.r.t. the Wasserstein distance (Proposition 4.21), 
and the proof of uniqueness of the gradient flow of the relative entropy w.r.t. the Wasserstein 
distance on spaces with Ricci curvature bounded below in the sense of Lott-Sturm-Villani 
($CD(K,\infty)$ spaces in short) given by the second author in [12] (Theorem 5.7). 

In summary, the main arguments and results that we present here are the following. 

(1) The Hopf-Lax formula produces subsolutions of the Hamilton-Jacobi equation, and 
solutions on geodesic spaces (Theorem 3.5 and Theorem 3.6). 

(2) A new approach to the theory of Sobolev spaces over metric measure spaces, which 
leads in particular to the proof that Lipschitz functions are always dense in energy in 
$W^{1,2}(X,d,m)$ (Theorem 4.26). 

(3) The uniqueness of the gradient flow w.r.t. the Wasserstein distance $W_2$ of the relative 
entropy in $CD(K,\infty)$ spaces (Theorem 5.7). 

(4) The identification of the $L^2$-gradient flow of the natural "Dirichlet energy" and the 
$W_2$-gradient flow of the relative entropy in $CD(K,\infty)$ spaces (see also [15] for the 
Alexandrov case, a paper to which our paper [4] owes a lot). 
(5) A metric version of Brenier's theorem valid in spaces having Ricci curvature bounded 
from below in a sense slightly stronger than the one proposed by Lott-Sturm-Villani. 
If this curvature assumption holds (Definition 7.1) and $\mu$, $\nu$ are absolutely continuous 
w.r.t. m, then "the distance traveled is uniquely determined by the starting point", i.e. 
there exists a map $D : X \to \mathbb{R}$ such that for any optimal plan $\gamma$ it holds $d(x,y) = D(x)$ 
for $\gamma$-a.e. $(x,y)$. Moreover, the map D is nothing but the weak gradient (according to 
the theory illustrated in Section 4) of any Kantorovich potential. See Theorem 7.3. 

(6) A key lemma (Lemma 8.2) concerning "horizontal" and "vertical" differentiation: it 
allows one to compare the derivative of the squared Wasserstein distance along the heat 
flow with the derivative of the relative entropy along a geodesic. 

(7) A new (stronger) definition of Ricci curvature bound from below for metric measure 
spaces which is stable w.r.t. measured Gromov-Hausdorff convergence and rules out 
Finsler geometries (Theorem 9.1 and the discussion thereafter). 

Acknowledgement. The authors acknowledge the support of the ERC ADG GeMeThNES 
and the PRIN08 grant from MIUR for the project Optimal transport theory, geometric and 
functional inequalities, and applications. 

The authors also thank A.Mondino for his careful reading of a preliminary version of this 
manuscript. 

2 Preliminary notions 

As a general convention, we will always denote by $(X, d)$ a compact metric space and by m a 
Borel probability measure on X; we will always refer to the structure $(X, d, m)$ as a compact 
and normalized metric measure space. We will use the symbol $(Y, d_Y)$ for metric spaces when 
the compactness is not implicitly assumed. 

2.1 Absolutely continuous curves and slopes 

Let $(Y, d_Y)$ be a complete and separable metric space, $J \subset \mathbb{R}$ an interval with nonempty 
interior and $J \ni t \mapsto \gamma_t \in Y$. We say that $\gamma_t$ is absolutely continuous if 

\[ d_Y(\gamma_s, \gamma_t) \le \int_t^s g(r)\,dr, \qquad \forall s, t \in J,\ t < s \]

for some $g \in L^1(J)$. It turns out that, if $\gamma_t$ is absolutely continuous, there is a minimal 
function g with this property, called metric speed and given for a.e. $t \in J$ by 

\[ |\dot\gamma_t| = \lim_{s \to t} \frac{d_Y(\gamma_s, \gamma_t)}{|s - t|}. \]

See [3, Theorem 1.1.2] for the simple proof. Notice that the absolute continuity property of 
the integral ensures that absolutely continuous functions can be extended by continuity to 
the closure of their domain. 

We will denote by $C([0,1], Y)$ the space of continuous curves on [0, 1] with values in Y 
endowed with the sup norm. The set $AC^2([0,1], Y) \subset C([0,1], Y)$ consists of all absolutely 
continuous curves $\gamma$ such that $\int_0^1 |\dot\gamma_t|^2\,dt < \infty$: it is easily seen to be equal to the countable 
union of the closed sets $\{\gamma : \int_0^1 |\dot\gamma_t|^2\,dt \le n\}$, and thus it is a Borel subset of $C([0,1], Y)$. The 
evaluation maps $e_t : C([0,1], Y) \to Y$ are defined by 

\[ e_t(\gamma) := \gamma_t, \]

and are clearly 1-Lipschitz. 

We say that a subset D of Y is geodesic if for any $x, y \in D$ there exists a curve $(\gamma_t) \subset D$ 
on [0, 1] such that $\gamma_0 = x$, $\gamma_1 = y$ and $d_Y(\gamma_t, \gamma_s) = |t - s|\,d_Y(x,y)$ for all $s, t \in [0,1]$. Such a 
curve is called constant speed geodesic, or simply geodesic. The space of all geodesics in Y 
endowed with the sup distance will be denoted by Geo(Y). 

Given $f : Y \to \mathbb{R} \cup \{\pm\infty\}$ we define the slope (also called local Lipschitz constant) at 
points x where $f(x) \in \mathbb{R}$ by 

\[ |Df|(x) := \limsup_{y \to x} \frac{|f(y) - f(x)|}{d_Y(y,x)}. \]

We shall also need the one-sided counterparts of the slope called respectively descending 
slope and ascending slope: 

\[ |D^- f|(x) := \limsup_{y \to x} \frac{[f(y) - f(x)]^-}{d_Y(y,x)}, \qquad |D^+ f|(x) := \limsup_{y \to x} \frac{[f(y) - f(x)]^+}{d_Y(y,x)}, \tag{2.1} \]

where $[\cdot]^+$ and $[\cdot]^-$ denote respectively the positive and negative part. Notice the change of 
notation w.r.t. previous works of the authors: the slope and its one-sided counterparts were 
denoted by $|\nabla f|$, $|\nabla^\pm f|$. Yet, as remarked in [13], these notions, being defined in duality with 
the distance, are naturally cotangent notions, rather than tangent ones, whence the notation 
proposed here. 

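It is not difficult to see that for f Lipschitz the slopes and the local Lipschitz constant are 
upper gradients according to [18], namely 

\[ \Big| \int_{\partial\gamma} f \Big| \le \int_\gamma |D^\pm f| \]

for any absolutely continuous curve $\gamma : [0,1] \to Y$; here and in the following we write $\int_{\partial\gamma} f$ 
for $f(\gamma_1) - f(\gamma_0)$ and $\int_\gamma g$ for $\int_0^1 g(\gamma_s)|\dot\gamma_s|\,ds$. 

Also, for $f, g : Y \to \mathbb{R}$ Lipschitz it clearly holds 

\[ |D(\alpha f + \beta g)| \le |\alpha||Df| + |\beta||Dg| \qquad \forall \alpha, \beta \in \mathbb{R}; \tag{2.2a} \]
\[ |D(fg)| \le |f||Dg| + |g||Df|. \tag{2.2b} \]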





2.2 The space $(\mathscr{P}(X), W_2)$ 

Let (X, d) be a compact metric space. The set $\mathscr{P}(X)$ consists of all Borel probability measures 
on X. As usual, if $\mu \in \mathscr{P}(X)$ and $T : X \to Y$ is a $\mu$-measurable map with values in 
the topological space Y, the push-forward measure $T_\sharp\mu \in \mathscr{P}(Y)$ is defined by $T_\sharp\mu(B) := \mu(T^{-1}(B))$ 
for every Borel set $B \subset Y$. 

Given $\mu, \nu \in \mathscr{P}(X)$, we define the Wasserstein distance $W_2(\mu, \nu)$ between them as 

\[ W_2^2(\mu, \nu) := \min \int d^2(x,y)\,d\gamma(x,y), \tag{2.3} \]

where the minimum is taken among all Borel probability measures $\gamma$ on $X^2$ such that 

\[ \pi^1_\sharp\gamma = \mu, \quad \pi^2_\sharp\gamma = \nu; \qquad \text{here } \pi^i : X^2 \to X,\ \pi^i(x_1, x_2) := x_i. \]

Such measures are called admissible plans or couplings for the couple $(\mu, \nu)$; a plan $\gamma$ which 
realizes the minimum in (2.3) is called optimal, and we write $\gamma \in \mathrm{OPT}(\mu, \nu)$. From the 
linearity of the admissibility condition we get that the squared Wasserstein distance is convex, 
i.e.: 

\[ W_2^2\big((1-\lambda)\mu_1 + \lambda\mu_2,\ (1-\lambda)\nu_1 + \lambda\nu_2\big) \le (1-\lambda)W_2^2(\mu_1, \nu_1) + \lambda W_2^2(\mu_2, \nu_2). \tag{2.4} \]

It is also well known (see e.g. Theorem 2.7 in [1]) that the Wasserstein distance metrizes the 
weak convergence of measures in $\mathscr{P}(X)$, i.e. the weak convergence with respect to the duality 
with $C(X)$; in particular $(\mathscr{P}(X), W_2)$ is a compact metric space. 
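As a concrete illustration of the minimization in (2.3), the following sketch computes $W_2$ between two uniform measures with the same number of atoms on a finite subset of [0, 1]. It relies on two standard facts that are not proved in these notes and are assumptions here: for such measures the minimum over couplings is attained by a permutation, and on the real line the monotone (sorted) matching is optimal. The function names and the sample data are purely illustrative.

```python
from itertools import permutations
import math

def w2_bruteforce(xs, ys):
    """W_2 between the uniform measures on the atoms xs and ys (same length),
    minimizing the quadratic cost over all permutation couplings."""
    n = len(xs)
    best = min(
        sum((xs[i] - ys[p[i]]) ** 2 for i in range(n)) / n
        for p in permutations(range(n))
    )
    return math.sqrt(best)

def w2_sorted(xs, ys):
    """Same quantity computed through the monotone matching of sorted atoms."""
    n = len(xs)
    cost = sum((a - b) ** 2 for a, b in zip(sorted(xs), sorted(ys))) / n
    return math.sqrt(cost)

xs = [0.0, 0.2, 0.9, 0.5]
ys = [1.0, 0.1, 0.4, 0.7]
print(w2_bruteforce(xs, ys), w2_sorted(xs, ys))
```

The brute-force search is only feasible for a handful of atoms; it is included to make the role of the minimum over couplings explicit.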

An equivalent definition of $W_2$ comes from the dual formulation of the transport problem: 

\[ \frac{1}{2} W_2^2(\mu, \nu) = \sup \int_X \psi\,d\mu + \int_X \psi^c\,d\nu, \tag{2.5} \]

the supremum being taken among all Lipschitz functions $\psi$, where the c-transform in this 
formula is defined by 

\[ \psi^c(y) := \inf_{x \in X} \frac{d^2(x,y)}{2} - \psi(x). \]

A function $\psi : X \to \mathbb{R}$ is said to be c-concave if $\psi = \varphi^c$ for some $\varphi : X \to \mathbb{R}$. It is possible 
to prove that the supremum in (2.5) is always achieved by a c-concave function, and we will 
call any such function a Kantorovich potential. We shall also use the fact that c-concave 
functions satisfy 

\[ \psi^{cc} = \psi. \tag{2.6} \]

The (graph of the) c-superdifferential $\partial^c\psi$ of a c-concave function $\psi$ is the subset of $X^2$ 
defined by 

\[ \partial^c\psi := \Big\{ (x,y) : \psi(x) + \psi^c(y) = \frac{d^2(x,y)}{2} \Big\}, \]

and the c-superdifferential $\partial^c\psi(x)$ at x is the set of y's such that $(x,y) \in \partial^c\psi$. A consequence 
of the compactness of X is that any c-concave function $\psi$ is Lipschitz and that the set $\partial^c\psi(x)$ 
is nonempty for any $x \in X$. 

It is not difficult to see that if $\psi$ is a Kantorovich potential for $\mu, \nu \in \mathscr{P}(X)$ and $\gamma$ is a 
coupling for $(\mu, \nu)$ then $\gamma$ is optimal if and only if $\mathrm{supp}(\gamma) \subset \partial^c\psi$. 

If (X, d) is geodesic, then so is $(\mathscr{P}(X), W_2)$, and in this case a curve $(\mu_t)$ is a constant speed 
geodesic from $\mu_0$ to $\mu_1$ if and only if there exists a measure $\pi \in \mathscr{P}(C([0,1], X))$ concentrated 
on Geo(X) such that $(e_t)_\sharp\pi = \mu_t$ for all $t \in [0,1]$ and $(e_0, e_1)_\sharp\pi \in \mathrm{OPT}(\mu_0, \mu_1)$. We will denote 
the set of such measures, called optimal geodesic plans, by $\mathrm{GeoOpt}(\mu_0, \mu_1)$. 

2.3 Geodesically convex functionals and their gradient flows 

Given a geodesic space $(Y, d_Y)$ (in the following this will always be the Wasserstein space 
built over a geodesic space (X, d)), a functional $E : Y \to \mathbb{R} \cup \{+\infty\}$ is said to be K-geodesically 
convex (or simply K-convex) if for any $y_0, y_1 \in Y$ there exists a constant speed geodesic 
$\gamma : [0,1] \to Y$ such that $\gamma_0 = y_0$, $\gamma_1 = y_1$ and 

\[ E(\gamma_t) \le (1-t)E(y_0) + tE(y_1) - \frac{K}{2}\,t(1-t)\,d_Y^2(y_0, y_1), \qquad \forall t \in [0,1]. \]

We will denote by D(E) the domain of E, i.e. $D(E) := \{y : E(y) < \infty\}$: if E is K-geodesically 
convex, then D(E) is geodesic. 

An easy consequence of the K-convexity is the fact that the descending slope defined in 
(2.1) can be computed as a sup, rather than as a limsup: 

\[ |D^- E|(y) = \sup_{z \ne y} \Big( \frac{E(y) - E(z)}{d_Y(y,z)} + \frac{K}{2}\,d_Y(y,z) \Big)^+. \tag{2.7} \]

What we want to discuss here is the definition of gradient flow of a K-convex functional. 
There are essentially two different ways of giving such a notion in a metric setting. The 
first one, which we call Energy Dissipation Equality (EDE), ensures existence for any 
K-convex and lower semicontinuous functional (under suitable compactness assumptions), the 
second one, which we call Evolution Variational Inequality (EVI), ensures uniqueness and 
K-contractivity of the flow. However, the price we pay for these stronger properties is that 
existence results for EVI solutions hold under much more restrictive assumptions. 

It is important to distinguish the two notions. The EDE one is the "correct one" to be 
used in a general metric context, because it ensures existence for any initial datum in the 
domain of the functional. However, typically gradient flows in the EDE sense are not unique: 
this is the reason for the analysis made in Section 5, which ensures that for the special case of 
the entropy functional uniqueness is indeed true. 

EVI gradient flows are in particular gradient flows in the EDE sense (see Proposition 2.5), 
ensure uniqueness, K-contractivity and provide strong a priori regularizing effects. 
Heuristically speaking, existence of gradient flows in the EVI sense depends also on properties of 
the distance, rather than on properties of the functional only. A more or less correct way 
of thinking about this is: gradient flows in the EVI sense exist if and only if the distance is 
Hilbertian on small scales. For instance, if the underlying metric space is a Hilbert space, 
then the two notions coincide. 

Now recall that one of our goals here is to study the gradient flow of the relative entropy in 
spaces with Ricci curvature bounded below (Definition 5.1), and recall that Finsler geometries 
are included in this setting (see page 926 of [32]). Thus, in general we must deal with the 
EDE notion of gradient flow. The EVI one will come into play in Section 9, where we use 
it to identify those spaces with Ricci curvature bounded below which are more 'Riemannian 
like'. 

Note: later on we will refer to gradient flows in the EDE sense simply as "gradient flows", 
keeping the distinguished notation EVI-gradient flows for those in the EVI sense. 



2.3.1 Energy Dissipation Equality 

An important property of K-geodesically convex and lower semicontinuous functionals (see 
Corollary 2.4.10 of [3] or Proposition 3.19 of [1]) is that the descending slope is an upper 
gradient, that is: for any absolutely continuous curve $y_t : J \subset \mathbb{R} \to D(E)$ it holds 

\[ |E(y_t) - E(y_s)| \le \int_t^s |\dot y_r|\,|D^- E|(y_r)\,dr, \qquad \forall t \le s. \tag{2.8} \]

An application of Young's inequality gives that 

\[ E(y_t) \le E(y_s) + \frac{1}{2}\int_t^s |\dot y_r|^2\,dr + \frac{1}{2}\int_t^s |D^- E|^2(y_r)\,dr, \qquad \forall t \le s. \tag{2.9} \]

This inequality motivates the following definition: 

Definition 2.1 (Energy Dissipation Equality definition of gradient flow) Let E be 
a K-convex and lower semicontinuous functional and let $y_0 \in D(E)$. We say that a 
continuous curve $[0,\infty) \ni t \mapsto y_t$ is a gradient flow for E in the EDE sense (or simply a 
gradient flow) if it is locally absolutely continuous in $(0,\infty)$, it takes values in the domain of 
E and it holds 

\[ E(y_t) = E(y_s) + \frac{1}{2}\int_t^s |\dot y_r|^2\,dr + \frac{1}{2}\int_t^s |D^- E|^2(y_r)\,dr, \qquad \forall\, 0 \le t \le s. \tag{2.10} \]

Notice that due to (2.9) the equality (2.10) is equivalent to 

\[ E(y_0) \ge E(y_s) + \frac{1}{2}\int_0^s |\dot y_r|^2\,dr + \frac{1}{2}\int_0^s |D^- E|^2(y_r)\,dr, \qquad \forall s > 0. \tag{2.11} \]

Indeed, if (2.11) holds, then (2.10) holds with t = 0, and then by linearity (2.10) holds in 
general. 

It is not hard to check that if $E : \mathbb{R}^d \to \mathbb{R}$ is a smooth function, then a curve $y_t : J \to \mathbb{R}^d$ is a 
gradient flow according to the previous definition if and only if it satisfies 

\[ y'_t = -\nabla E(y_t), \qquad \forall t \in J, \]

so that the metric definition reduces to the classical one when specialized to Euclidean spaces. 
The following theorem has been proved in [3] (Corollary 2.4.11): 

Theorem 2.2 (Existence of gradient flows in the EDE sense) Let $(Y, d_Y)$ be a 
compact metric space and let $E : Y \to \mathbb{R} \cup \{+\infty\}$ be a K-geodesically convex and lower 
semicontinuous functional. Then every $y_0 \in D(E)$ is the starting point of a gradient flow in the EDE 
sense of E. 

It is important to stress the fact that in general gradient flows in the EDE sense are 
not unique. A simple example is $Y := \mathbb{R}^2$ endowed with the $L^\infty$ norm, and E defined by 
$E(x,y) := x$. It is immediate to see that E is 0-convex and that for any point $(x_0, y_0)$ there 
exist uncountably many gradient flows in the EDE sense starting from it, for instance all curves 
$(x_0 - t, y(t))$ with $|y'(t)| \le 1$ and $y(0) = y_0$. 
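The non-uniqueness example can be checked numerically. The sketch below evaluates the defect between the two sides of (2.11) for curves of the form $(x_0 - t, y(t))$ in $\mathbb{R}^2$ with the sup norm, using two facts that follow from the definitions: the descending slope of $E(x,y) = x$ is identically 1, and the metric speed of the curve is $\max\{1, |y'(t)|\}$. The function name, the choice $x_0 = 0$ and the finite-difference approximation of the metric speed are assumptions made for the illustration.

```python
import math

def ede_defect(y, s, steps=2000):
    """E(p_0) - E(p_s) - 1/2 int |p'|^2 - 1/2 int |D^-E|^2 along the curve
    p(t) = (-t, y(t)) in (R^2, sup norm), for the functional E(x, y) = x."""
    h = s / steps
    speed_sq = 0.0
    for k in range(steps):
        t0, t1 = k * h, (k + 1) * h
        # metric speed in the sup norm, approximated by a difference quotient
        v = max(t1 - t0, abs(y(t1) - y(t0))) / h
        speed_sq += v * v * h
    slope_sq = s      # |D^-E| = 1 everywhere, so its squared integral is s
    energy_drop = s   # E(p_0) - E(p_s) = 0 - (-s)
    return energy_drop - 0.5 * speed_sq - 0.5 * slope_sq

# Three different curves with |y'| <= 1: the defect vanishes for all of them.
for curve in (lambda t: 0.0, math.sin, lambda t: t):
    print(ede_defect(curve, 1.0))

# A curve with |y'| = 2 moves too fast and violates (2.11).
print(ede_defect(lambda t: 2 * t, 1.0))
```

A defect equal to zero means that (2.11) holds with equality, so each of the first three curves is a gradient flow in the EDE sense starting from the same point.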

2.3.2 Evolution Variational Inequality 

To see where the EVI notion comes from, notice that for a K-convex and smooth function f 
on $\mathbb{R}^d$ it holds $y'_t = -\nabla f(y_t)$ for any t > 0 if and only if 

\[ \frac{d}{dt}\frac{|y_t - z|^2}{2} + \frac{K}{2}|y_t - z|^2 + f(y_t) \le f(z), \qquad \forall z \in \mathbb{R}^d,\ \forall t > 0. \tag{2.12} \]

This equivalence is true because K-convexity ensures that $v = -\nabla f(y)$ if and only if 

\[ \langle v, y - z \rangle + \frac{K}{2}|y - z|^2 + f(y) \le f(z), \qquad \forall z \in \mathbb{R}^d. \]

Inequality (2.12) can be written in a metric context in several ways, which we collect in the 
following statement (we omit the easy proof). 
Proposition 2.3 (Evolution Variational Inequality: equivalent statements) Let 
$(Y, d_Y)$ be a complete and separable metric space, $E : Y \to (-\infty, \infty]$ a lower semicontinuous 
functional. Then the following properties are equivalent. 

(i) For any $z \in Y$ it holds 

\[ \frac{d}{dt}\frac{d_Y^2(y_t, z)}{2} + \frac{K}{2}\,d_Y^2(y_t, z) + E(y_t) \le E(z), \qquad \text{for a.e. } t \in (0,\infty). \]

(ii) For any $z \in Y$ it holds 

\[ \frac{d_Y^2(y_s, z) - d_Y^2(y_t, z)}{2} + \frac{K}{2}\int_t^s d_Y^2(y_r, z)\,dr + \int_t^s E(y_r)\,dr \le (s-t)E(z), \qquad \forall\, 0 \le t < s < \infty. \]

(iii) There exists a set $A \subset D(E)$ dense in energy (i.e., for any $z \in D(E)$ there exists 
$(z_n) \subset A$ converging to z such that $E(z_n) \to E(z)$) such that for any $z \in A$ it holds 

\[ \limsup_{h \downarrow 0} \frac{d_Y^2(y_{t+h}, z) - d_Y^2(y_t, z)}{2h} + \frac{K}{2}\,d_Y^2(y_t, z) + E(y_t) \le E(z), \qquad \forall t \in (0,\infty). \]

Definition 2.4 (Evolution Variational Inequality definition of gradient flow) We 
say that a curve $(y_t)$ is a gradient flow of E in the EVI sense relative to $K \in \mathbb{R}$ (in short, 
$EVI_K$-gradient flow), if any of the above equivalent properties are true. We say that $y_t$ starts 
from $y_0$ if $y_t \to y_0$ as $t \downarrow 0$. 

This definition of gradient flow is stronger than the one discussed in the previous section, 
because of the following result proved by the third author in [29] (see also Proposition 3.6 of 
[1]), which we state without proof. 

Proposition 2.5 (EVI implies EDE) Let $(Y, d_Y)$ be a complete and separable metric 
space, $K \in \mathbb{R}$, $E : Y \to (-\infty, \infty]$ a lower semicontinuous functional and $y_t : (0,\infty) \to D(E)$ 
a locally absolutely continuous curve. Assume that $y_t$ is an $EVI_K$-gradient flow for E. Then 
(2.10) holds for any $0 < t \le s$. 

Remark 2.6 (Contractivity) It can be proved that if $(y_t)$ and $(z_t)$ are gradient flows in 
the $EVI_K$ sense of the l.s.c. functional E, then 

\[ d_Y(y_t, z_t) \le e^{-Kt}\,d_Y(y_0, z_0), \qquad \forall t \ge 0. \]

In particular, gradient flows in the EVI sense are unique. This contractivity property, used 
in conjunction with (ii) of Proposition 2.3, guarantees that if existence of gradient flows in 
the EVI sense is known for initial data lying in some subset $S \subset Y$, then it is also known for 
initial data in the closure $\overline{S}$ of S. ■ 
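To get a feeling for the contractivity estimate in the simplest Hilbertian situation, one can run the following sketch on the real line. It is only a Euclidean illustration under stated assumptions: the K-convex function $f(y) = \frac{K}{2}y^2 + \log\cosh y$, the explicit Euler discretization and the step size are our choices, not objects from the text. Since $K \le f'' \le K + 1$, each Euler step multiplies distances by a factor between $1 - h(K+1)$ and $1 - hK \le e^{-hK}$, so the discrete flow obeys the same bound.

```python
import math

def euler_flow(y0, K, t, h=0.01):
    """Explicit Euler for y' = -f'(y), with f(y) = K/2 * y^2 + log(cosh(y))."""
    y = y0
    for _ in range(round(t / h)):
        y -= h * (K * y + math.tanh(y))
    return y

def contraction_ratio(y0, z0, K, t):
    """d(y_t, z_t) / (exp(-K t) * d(y_0, z_0)); the estimate predicts a value <= 1."""
    yt, zt = euler_flow(y0, K, t), euler_flow(z0, K, t)
    return abs(yt - zt) / (math.exp(-K * t) * abs(y0 - z0))

print(contraction_ratio(2.0, -1.0, 0.5, 3.0))   # K > 0: distances shrink exponentially
print(contraction_ratio(2.0, -1.0, -0.5, 3.0))  # K < 0: distances may grow, at rate at most |K|
```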

We also point out the following geometric consequence of the EVI, proven in [10]. 

Proposition 2.7 Let $E : Y \to (-\infty, \infty]$ be a lower semicontinuous functional on a complete 
space $(Y, d_Y)$. Assume that every $y_0 \in D(E)$ is the starting point of an $EVI_K$-gradient flow 
of E. Then E is K-convex along all geodesics contained in D(E). 

As we already said, gradient flows in the EVI sense do not necessarily exist, and their 
existence depends on the properties of the distance $d_Y$. For instance, it is not hard to see 
that if we endow $\mathbb{R}^2$ with the $L^\infty$ norm and consider the functional $E(x,y) := x$, then 
there is no gradient flow in the $EVI_K$-sense, regardless of the constant K. 
3 Hopf-Lax formula and Hamilton-Jacobi equation 

The aim of this section is to study the properties of the Hopf-Lax formula in a metric setting 
and its relations with the Hamilton-Jacobi equation. Here we assume that (X, d) is a compact 
metric space. Notice that there is no reference measure m in the discussion. 
Let $f : X \to \mathbb{R}$ be a Lipschitz function. For t > 0 define 

\[ F(t,x,y) := f(y) + \frac{d^2(x,y)}{2t}, \]

and the function $Q_t f : X \to \mathbb{R}$ by 

\[ Q_t f(x) := \inf_{y \in X} F(t,x,y) = \min_{y \in X} F(t,x,y). \]

Also, we introduce the functions $D^+, D^- : X \times (0,\infty) \to \mathbb{R}$ as 

\[ D^+(x,t) := \max d(x,y), \]
\[ D^-(x,t) := \min d(x,y), \]

where, in both cases, the y's vary among all minima of $F(t,x,\cdot)$. We also set $Q_0 f = f$ and 
$D^\pm(x,0) = 0$. Thanks to the continuity of F and the compactness of X, it is easy to check 
that the map $[0,\infty) \times X \ni (t,x) \mapsto Q_t f(x)$ is continuous. Furthermore, the fact that f is 
Lipschitz easily yields 

\[ D^-(x,t) \le D^+(x,t) \le 2t\,\mathrm{Lip}(f), \tag{3.2} \]

and from the fact that the functions $\{d^2(\cdot, y)\}_{y \in X}$ are uniformly Lipschitz (because (X, d) is 
bounded) we get that $Q_t f$ is Lipschitz for any t > 0. 
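Since a finite set with the induced distance is a compact metric space, the definitions above can be tested directly on a grid. The following sketch computes $Q_t f$, $D^-$ and $D^+$ on a finite subset of [0, 1] for a 1-Lipschitz datum; the grid, the datum $f(y) = |y - 1/2|$, the function name and the tolerance used to detect ties among minimizers are all assumptions of the illustration.

```python
def hopf_lax(points, f, t):
    """Return the lists (Q_t f, D^-, D^+) over a finite subset of the real line."""
    Q, Dminus, Dplus = [], [], []
    for x in points:
        values = [f(y) + (x - y) ** 2 / (2 * t) for y in points]
        q = min(values)
        # distances d(x, y) over all (numerical) minimizers y of F(t, x, .)
        dists = [abs(x - y) for y, v in zip(points, values) if v <= q + 1e-12]
        Q.append(q)
        Dminus.append(min(dists))
        Dplus.append(max(dists))
    return Q, Dminus, Dplus

points = [i / 50 for i in range(51)]   # finite subset of [0, 1]
f = lambda y: abs(y - 0.5)             # datum with Lip(f) = 1

Q, Dm, Dp = hopf_lax(points, f, 0.1)
print(max(Dp), "<=", 2 * 0.1)          # the bound (3.2) with Lip(f) = 1
```

On such a grid the minimizer of $F(t,x,\cdot)$ may fail to be unique, in which case $D^-(x,t)$ and $D^+(x,t)$ differ; the inequalities relating them only use the minimality of the points involved.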

Proposition 3.1 (Monotonicity of $D^\pm$) For all $x \in X$ it holds 

\[ D^+(x,t) \le D^-(x,s), \qquad 0 \le t < s. \tag{3.3} \]

As a consequence, $D^+(x,\cdot)$ and $D^-(x,\cdot)$ are both nondecreasing, and they coincide with at 
most countably many exceptions in $[0,\infty)$. 

Proof Fix $x \in X$. For t = 0 there is nothing to prove. Now pick 0 < t < s and choose $x_t$ 
and $x_s$ minimizers of $F(t,x,\cdot)$ and $F(s,x,\cdot)$ respectively, such that $d(x,x_t) = D^+(x,t)$ and 
$d(x,x_s) = D^-(x,s)$. The minimality of $x_t$, $x_s$ gives 

\[ f(x_t) + \frac{d^2(x_t,x)}{2t} \le f(x_s) + \frac{d^2(x_s,x)}{2t}, \]
\[ f(x_s) + \frac{d^2(x_s,x)}{2s} \le f(x_t) + \frac{d^2(x_t,x)}{2s}. \]

Adding up and using the fact that $\frac{1}{t} \ge \frac{1}{s}$ we deduce 

\[ D^+(x,t) = d(x_t,x) \le d(x_s,x) = D^-(x,s), \]

which is (3.3). 

Combining this with the inequality $D^- \le D^+$ we immediately obtain that both functions 
are nondecreasing. At a point of right continuity of $D^-(x,\cdot)$ we get 

\[ D^+(x,t) \le \inf_{s > t} D^-(x,s) = D^-(x,t). \]

This implies that the two functions coincide out of a countable set. □ 
Next, we examine the semicontinuity properties of $D^\pm$. These properties imply that points 
(x,t) where the equality $D^+(x,t) = D^-(x,t)$ occurs are continuity points for both $D^+$ and 
$D^-$. 

Proposition 3.2 (Semicontinuity of $D^\pm$) The map $D^+$ is upper semicontinuous and the 
map $D^-$ is lower semicontinuous in $X \times (0,\infty)$. 

Proof We prove lower semicontinuity of $D^-$, the proof of upper semicontinuity of $D^+$ being 
similar. Let $(x_i, t_i)$ be any sequence converging to (x,t) and, for every i, let $y_i$ be a minimum 
of $F(t_i, x_i, \cdot)$ for which $d(y_i, x_i) = D^-(x_i, t_i)$. For all i we have 

\[ f(y_i) + \frac{d^2(y_i, x_i)}{2t_i} = Q_{t_i} f(x_i). \]

Moreover, the continuity of $(x,t) \mapsto Q_t f(x)$ gives that $\lim_i Q_{t_i} f(x_i) = Q_t f(x)$, thus 

\[ \lim_{i \to \infty} f(y_i) + \frac{d^2(y_i, x)}{2t} = Q_t f(x). \]

This means that $(y_i)$ is a minimizing sequence for $F(t,x,\cdot)$. Since (X, d) is compact, possibly 
passing to a subsequence, not relabeled, we may assume that $(y_i)$ converges to y as $i \to \infty$. 
Therefore 

\[ D^-(x,t) \le d(x,y) = \lim_{i \to \infty} d(x, y_i) = \lim_{i \to \infty} D^-(x_i, t_i). \]

□ 



Proposition 3.3 (Time derivative of $Q_t f$) The map $t \mapsto Q_t f$ is Lipschitz from $[0,\infty)$ to 
$C(X)$ and, for all $x \in X$, it satisfies 

\[ \frac{d}{dt} Q_t f(x) = -\frac{[D^\pm(x,t)]^2}{2t^2} \tag{3.4} \]

for any t > 0 with at most countably many exceptions. 

Proof Let t < s and $x_t$, $x_s$ be minima of $F(t,x,\cdot)$ and $F(s,x,\cdot)$. We have 

\[ Q_s f(x) - Q_t f(x) \le F(s,x,x_t) - F(t,x,x_t) = \frac{d^2(x,x_t)}{2}\,\frac{t-s}{ts}, \]
\[ Q_s f(x) - Q_t f(x) \ge F(s,x,x_s) - F(t,x,x_s) = \frac{d^2(x,x_s)}{2}\,\frac{t-s}{ts}, \]

which gives that $t \mapsto Q_t f(x)$ is Lipschitz in $(\varepsilon, +\infty)$ for any $\varepsilon > 0$ and $x \in X$. Also, dividing 
by (s - t) and taking Proposition 3.1 into account, we get (3.4). Now notice that from (3.2) 
we get that $|\frac{d}{dt} Q_t f(x)| \le 2\,\mathrm{Lip}^2(f)$ for any x and a.e. t, which, together with the pointwise 
convergence of $Q_t f$ to f as $t \downarrow 0$, yields that $t \mapsto Q_t f \in C(X)$ is Lipschitz in $[0,\infty)$. □ 

Proposition 3.4 (Bound on the local Lipschitz constant of $Q_t f$) For $(x,t) \in X \times (0,\infty)$ 
it holds: 

\[ |DQ_t f|(x) \le \frac{D^+(x,t)}{t}. \tag{3.5} \]

Proof Fix $x \in X$ and $t \in (0,\infty)$, pick a sequence $(x_i)$ converging to x and a corresponding 
sequence $(y_i)$ of minimizers for $F(t,x_i,\cdot)$ and similarly a minimizer y of $F(t,x,\cdot)$. We start 
proving that 

\[ \limsup_{i \to \infty} \frac{Q_t f(x) - Q_t f(x_i)}{d(x,x_i)} \le \frac{D^+(x,t)}{t}. \]

Since it holds 

\[ Q_t f(x) - Q_t f(x_i) \le F(t,x,y_i) - F(t,x_i,y_i) \le f(y_i) + \frac{d^2(x,y_i)}{2t} - f(y_i) - \frac{d^2(x_i,y_i)}{2t} \]
\[ \le \frac{d(x,x_i)}{2t}\big(d(x,y_i) + d(x_i,y_i)\big) \le \frac{d(x,x_i)}{2t}\big(d(x,x_i) + 2D^+(x_i,t)\big), \]

dividing by $d(x,x_i)$, letting $i \to \infty$ and using the upper semicontinuity of $D^+$ we get the 
claim. To conclude, we need to show that 

\[ \limsup_{i \to \infty} \frac{Q_t f(x_i) - Q_t f(x)}{d(x,x_i)} \le \frac{D^+(x,t)}{t}. \]

This follows along similar lines starting from the inequality 

\[ Q_t f(x_i) - Q_t f(x) \le F(t,x_i,y) - F(t,x,y). \]

□ 



Theorem 3.5 (Subsolution of HJ) For every $x \in X$ it holds 

\[ \frac{d}{dt} Q_t f(x) + \frac{1}{2}|DQ_t f|^2(x) \le 0 \tag{3.6} \]

with at most countably many exceptions in $(0,\infty)$. 

Proof The claim is a direct consequence of Proposition 3.3 and Proposition 3.4. □ 

We just proved that in an arbitrary metric space the Hopf-Lax formula produces 
subsolutions of the Hamilton-Jacobi equation. Our aim now is to prove that if (X, d) is a geodesic 
space, then the same formula provides also supersolutions. 

Theorem 3.6 (Supersolution of HJ) Assume that (X, d) is a geodesic space. Then 
equality holds in (3.5). In particular, for all $x \in X$ it holds 

\[ \frac{d}{dt} Q_t f(x) + \frac{1}{2}|DQ_t f|^2(x) = 0, \]

with at most countably many exceptions in $(0,\infty)$. 

Proof Let y be a minimum of $F(t,x,\cdot)$ such that $d(x,y) = D^+(x,t)$. Let $\gamma : [0,1] \to X$ be a 
constant speed geodesic connecting x to y. We have 

\[ Q_t f(x) - Q_t f(\gamma_s) \ge f(y) + \frac{d^2(x,y)}{2t} - f(y) - \frac{d^2(\gamma_s,y)}{2t} \]
\[ = \frac{d^2(x,y) - d^2(\gamma_s,y)}{2t} = \frac{(D^+(x,t))^2(2s - s^2)}{2t}. \]

Therefore we obtain 

\[ \limsup_{s \downarrow 0} \frac{Q_t f(x) - Q_t f(\gamma_s)}{d(x,\gamma_s)} = \limsup_{s \downarrow 0} \frac{Q_t f(x) - Q_t f(\gamma_s)}{s\,D^+(x,t)} \ge \frac{D^+(x,t)}{t}. \]

Since $s \mapsto \gamma_s$ is a particular family converging to x we deduce 

\[ |D^- Q_t f|(x) \ge \frac{D^+(x,t)}{t}. \]

Taking into account Proposition 3.3 and Proposition 3.4 we conclude. □ 
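The equality case can be observed on the geodesic space X = [-1, 1] with f(y) = |y|. In this example the minimizer of $F(t,x,\cdot)$ is $y = x - t\,\mathrm{sign}(x)$ when $|x| \ge t$ and $y = 0$ otherwise, which gives the closed form used below. The sketch compares it with a brute-force minimization and then evaluates the Hamilton-Jacobi residual by central differences at points where $Q_t f$ is differentiable in x, so that the slope equals the absolute value of the derivative. The datum, the function names and the step sizes are assumptions of the illustration.

```python
def Q(t, x):
    """Closed form of Q_t f(x) for f(y) = |y| on X = [-1, 1]."""
    return abs(x) - t / 2 if abs(x) >= t else x * x / (2 * t)

def Q_grid(t, x, n=4001):
    """Brute-force value of min_y |y| + (x - y)^2 / (2t) over a grid of [-1, 1]."""
    ys = (-1 + 2 * k / (n - 1) for k in range(n))
    return min(abs(y) + (x - y) ** 2 / (2 * t) for y in ys)

def hj_residual(t, x, h=1e-6):
    """Central-difference value of d/dt Q_t f(x) + |D Q_t f|^2(x) / 2."""
    dt = (Q(t + h, x) - Q(t - h, x)) / (2 * h)
    dx = (Q(t, x + h) - Q(t, x - h)) / (2 * h)
    return dt + 0.5 * dx * dx

print(hj_residual(0.3, 0.7))   # region |x| > t
print(hj_residual(0.5, 0.2))   # region |x| < t
```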



4 Weak definitions of gradient 

In this section we introduce two weak notions of 'norm of the differential', one inspired by 
Cheeger's seminal paper [9], that we call minimal relaxed slope and denote by $|Df|_*$, and one 
inspired by the papers of Koskela-MacManus [20] and of Shanmugalingam [30], that we call 
minimal weak upper gradient and denote by $|Df|_w$. Notice that, as for the slopes, the objects 
that we are going to define are naturally in duality with the distance, thus are cotangent 
notions: that's why we use the 'D' instead of the '$\nabla$' in the notation. Still, we will continue 
speaking of upper gradients and their weak counterparts to be aligned with the convention 
used in the literature (see [13] for a broader discussion on this distinction between tangent 
and cotangent objects and its effects on calculus). 

We compare our concepts with those of the original papers in Subsection 4.4, where we 
show that all these approaches a posteriori coincide. As usual, we will adopt the simplifying 
assumption that (X, d, m) is a compact and normalized metric measure space, i.e. (X, d) is 
compact and $m \in \mathscr{P}(X)$. 

4.1 The "vertical" approach: minimal relaxed slope 

Definition 4.1 (Relaxed slopes) We say that $G \in L^2(X,m)$ is a relaxed slope of 
$f \in L^2(X,m)$ if there exist $\tilde G \in L^2(X,m)$ and Lipschitz functions $f_n : X \to \mathbb{R}$ such that: 

(a) $f_n \to f$ in $L^2(X,m)$ and $|Df_n|$ weakly converges to $\tilde G$ in $L^2(X,m)$; 

(b) $\tilde G \le G$ m-a.e. in X. 

We say that G is the minimal relaxed slope of f if its $L^2(X,m)$ norm is minimal among 
relaxed slopes. We shall denote by $|Df|_*$ the minimal relaxed slope. 

Using Mazur's lemma and (2.2a) (see Proposition 4.3) it is possible to show that an 
equivalent characterization of relaxed slopes can be given by modifying (a) as follows: $\tilde G$ is 
the strong limit in $L^2(X,m)$ of $G_n \ge |Df_n|$. The definition of relaxed slope we gave is useful 
to show existence of relaxed slopes (as soon as an approximating sequence $(f_n)$ with $|Df_n|$ 
bounded in $L^2(X,m)$ exists) while the equivalent characterization is useful to perform diagonal 
arguments and to show that the class of relaxed slopes is a convex closed set. Therefore the 
definition of $|Df|_*$ is well posed. 

Lemma 4.2 (Locality) Let $G_1$, $G_2$ be relaxed slopes of f. Then $\min\{G_1, G_2\}$ is a relaxed 
slope as well. In particular, for any relaxed slope G it holds 

\[ |Df|_* \le G \qquad m\text{-a.e. in } X. \]

Proof It is sufficient to prove that if $B \subset X$ is a Borel set, then $\chi_B G_1 + \chi_{X \setminus B} G_2$ is a relaxed 
slope of f. By approximation, taking into account the closure of the class of relaxed slopes, 
we can assume with no loss of generality that B is an open set. We fix r > 0 and a Lipschitz 
function $\phi_r : X \to [0,1]$ equal to 0 on $X \setminus B_r$ and equal to 1 on $B_{2r}$, where the open sets 
$B_s \subset B$ are defined by 

\[ B_s := \{ x \in X : \mathrm{dist}(x, X \setminus B) > s \} \subset B. \]

Let now $f_{n,i}$, i = 1, 2, be Lipschitz functions converging to f in $L^2(X,m)$ as $n \to \infty$, 
with $|Df_{n,i}|$ weakly convergent to $G_i$ and set $f_n := \phi_r f_{n,1} + (1 - \phi_r) f_{n,2}$. Then, $|Df_n| = |Df_{n,1}|$ 
on $B_{2r}$ and $|Df_n| = |Df_{n,2}|$ on $X \setminus \overline{B}_r$; in $\overline{B}_r \setminus B_{2r}$, by applying (2.2a) and (2.2b), we can 
estimate 

\[ |Df_n| \le |Df_{n,2}| + \mathrm{Lip}(\phi_r)\,|f_{n,1} - f_{n,2}| + \phi_r\big(|Df_{n,1}| + |Df_{n,2}|\big). \]

Since $\overline{B}_r \subset B$, by taking weak limits of a subsequence, it follows that 

\[ \chi_{B_{2r}} G_1 + \chi_{X \setminus \overline{B}_r} G_2 + \chi_{\overline{B}_r \setminus B_{2r}}(G_1 + 2G_2) \]

is a relaxed slope of f. Letting $r \downarrow 0$ gives that $\chi_B G_1 + \chi_{X \setminus B} G_2$ is a relaxed slope as well. 

For the second part of the statement argue by contradiction: let G be a relaxed slope 
of f and assume that $B = \{G < |Df|_*\}$ is such that m(B) > 0. Consider the relaxed 
slope $G\chi_B + |Df|_*\chi_{X \setminus B}$: its $L^2$ norm is strictly less than the $L^2$ norm of $|Df|_*$, which is a 
contradiction. □ 

A trivial consequence of the definition and of the locality principle we just proved is that 
if $f : X \to \mathbb{R}$ is Lipschitz it holds: 

\[ |Df|_* \le |Df| \qquad m\text{-a.e. in } X. \tag{4.1} \]

We also remark that it is possible to obtain the minimal relaxed slope as strong limit in $L^2$ 
of slopes of Lipschitz functions, and not only weak, as shown in the next proposition. 

Proposition 4.3 (Strong approximation) If $f \in L^2(X,m)$ has a relaxed slope, there 
exist Lipschitz functions $f_n$ convergent to f in $L^2(X,m)$ with $|Df_n|$ convergent to $|Df|_*$ in 
$L^2(X,m)$. 

Proof If $g_i \to f$ in $L^2$ and $|Dg_i|$ weakly converges to $|Df|_*$ in $L^2$, by Mazur's lemma we 
can find a sequence of convex combinations of $|Dg_i|$ strongly convergent to $|Df|_*$ in $L^2$; the 
corresponding convex combinations of $g_i$, that we shall denote by $f_n$, still converge in $L^2$ to 
f and $|Df_n|$ is dominated by the convex combinations of $|Dg_i|$. It follows that 

\[ \limsup_{n \to \infty} \int_X |Df_n|^2\,dm \le \limsup_{i \to \infty} \int_X |Dg_i|^2\,dm = \int_X |Df|_*^2\,dm. \]

This implies at once that $|Df_n|$ weakly converges to $|Df|_*$ (because any limit point in the 
weak topology is a relaxed slope with minimal norm) and that the convergence is strong. 

□ 

Theorem 4.4 The Cheeger energy functional 

\[ \mathrm{Ch}(f) := \frac{1}{2}\int_X |Df|_*^2\,dm, \tag{4.2} \]

set to $+\infty$ if f has no relaxed slope, is convex and lower semicontinuous in $L^2(X,m)$. 

Proof A simple byproduct of condition (2.2a) is that $\alpha F + \beta G$ is a relaxed slope of $\alpha f + \beta g$ 
whenever $\alpha$, $\beta$ are nonnegative constants and F, G are relaxed slopes of f, g respectively. 
Taking $F = |Df|_*$ and $G = |Dg|_*$ yields the convexity of Ch, while lower semicontinuity 
follows by a simple diagonal argument based on the strong approximation property stated in 
Proposition 4.3. □ 

Proposition 4.5 (Chain rule) If $f \in L^2(X,m)$ has a relaxed slope and $\phi : \mathbb{R} \to \mathbb{R}$ is 
Lipschitz and $C^1$, then $|D\phi(f)|_* = |\phi'(f)||Df|_*$ m-a.e. in X. 

Proof We trivially have $|D\phi(f)| \le |\phi'(f)||Df|$. If we apply this inequality to the 
"optimal" approximating sequence of Lipschitz functions given by Proposition 4.3 we get that 
$|\phi'(f)||Df|_*$ is a relaxed slope of $\phi(f)$, so that $|D\phi(f)|_* \le |\phi'(f)||Df|_*$ m-a.e. in X. 
Applying twice this inequality with $\phi(r) := -r$ we get $|Df|_* \le |D(-f)|_* \le |Df|_*$ and thus 
$|Df|_* = |D(-f)|_*$ m-a.e. in X. 

Up to a simple rescaling, we can assume $|\phi'| \le 1$. Let $\psi_1(z) := z - \phi(z)$, notice that 
$\psi_1' \ge 0$ and thus m-a.e. on $f^{-1}(\{\phi' \ge 0\})$ it holds 

\[ |Df|_* \le |D(\phi(f))|_* + |D(\psi_1(f))|_* \le \phi'(f)|Df|_* + \psi_1'(f)|Df|_* = |Df|_*, \]

hence all the inequalities must be equalities, which forces $|D(\phi(f))|_* = \phi'(f)|Df|_*$ m-a.e. on 
$f^{-1}(\{\phi' \ge 0\})$. Similarly, let $\psi_2(z) = -z - \phi(z)$ and notice that $\psi_2' \le 0$, so that m-a.e. on 
$f^{-1}(\{\phi' \le 0\})$ it holds 

\[ |Df|_* = |D(-f)|_* \le |D(\phi(f))|_* + |D(\psi_2(f))|_* \le -\phi'(f)|Df|_* - \psi_2'(f)|Df|_* = |Df|_*. \]

As before we can conclude that $|D(\phi(f))|_* = -\phi'(f)|Df|_*$ m-a.e. on $f^{-1}(\{\phi' \le 0\})$. □ 

Still by approximation, it is not difficult to show that $\phi(f)$ has a relaxed slope if $\phi$ is
Lipschitz, and that $|D\phi(f)|_*=|\phi'(f)|\,|Df|_*$ $\mathfrak m$-a.e. in $X$. In this case $\phi'(f)$ is undefined at
points $x$ such that $\phi$ is not differentiable at $f(x)$; on the other hand the formula still makes
sense because $|Df|_*=0$ $\mathfrak m$-a.e. on $f^{-1}(N)$ for any Lebesgue negligible set $N\subset\mathbb R$. Particularly
useful is the case when $\phi$ is a truncation function, for instance $\phi(z)=\min\{z,M\}$. In this
case

\[
|D\min\{f,M\}|_* =
\begin{cases}
|Df|_* & \text{if } f(x)<M,\\
0 & \text{if } f(x)\ge M.
\end{cases}
\]
Analogous formulas hold for truncations from below.



4.1.1 Laplacian: definition and basic properties 

Since the domain of $\mathrm{Ch}$ is dense in $L^2(X,\mathfrak m)$ (it includes Lipschitz functions), the Hilbertian
theory of gradient flows (see for instance [8], [3]) can be applied to Cheeger's functional (4.2)
to provide, for all $f_0\in L^2(X,\mathfrak m)$, a locally Lipschitz continuous map $t\mapsto f_t$ from $(0,\infty)$ to
$L^2(X,\mathfrak m)$, with $f_t\to f_0$ as $t\downarrow0$, whose derivative satisfies

\[
\frac{d}{dt}f_t \in -\partial^-\mathrm{Ch}(f_t) \qquad\text{for a.e. } t. \tag{4.3}
\]

Here $\partial^-\mathrm{Ch}(g)$ denotes the subdifferential of $\mathrm{Ch}$ at $g\in D(\mathrm{Ch})$ in the sense of convex analysis,
i.e.

\[
\partial^-\mathrm{Ch}(g) := \Big\{\xi\in L^2(X,\mathfrak m) : \mathrm{Ch}(f)\ge\mathrm{Ch}(g)+\int_X \xi(f-g)\,d\mathfrak m \quad \forall f\in L^2(X,\mathfrak m)\Big\}.
\]

Another important regularizing effect of gradient flows of convex l.s.c. functionals lies in the
fact that for every $t>0$ (the opposite of) the right derivative $-\frac{d}{dt_+}f_t=\lim_{h\downarrow0}\frac1h(f_t-f_{t+h})$
exists and it is actually the element with minimal $L^2(X,\mathfrak m)$ norm in $\partial^-\mathrm{Ch}(f_t)$. This motivates
the next definition:

Definition 4.6 (Laplacian) The Laplacian $\Delta f$ of $f\in L^2(X,\mathfrak m)$ is defined for those $f$ such
that $\partial^-\mathrm{Ch}(f)\neq\emptyset$. For those $f$, $-\Delta f$ is the element of minimal $L^2(X,\mathfrak m)$ norm in $\partial^-\mathrm{Ch}(f)$.
The domain of $\Delta$ is denoted by $D(\Delta)$.

Remark 4.7 (Potential lack of linearity) It should be observed that in general the
Laplacian - as we just defined it - is not a linear operator: the potential lack of linearity
is strictly related to the fact that potentially the space $W^{1,2}(X,\mathsf d,\mathfrak m)$ is not Hilbert, because
$f\mapsto\int_X|Df|_*^2\,d\mathfrak m$ need not be quadratic. For instance if $X=\mathbb R^2$, $\mathfrak m$ is the Lebesgue measure
and $\mathsf d$ is the distance induced by the $L^\infty$ norm, then it is easily seen that

\[
|Df|_*^2 = \Big(\Big|\frac{\partial f}{\partial x}\Big| + \Big|\frac{\partial f}{\partial y}\Big|\Big)^2.
\]

Even though the Laplacian is not linear, the trivial implication

\[
v\in\partial^-\mathrm{Ch}(f) \quad\Longrightarrow\quad \lambda v\in\partial^-\mathrm{Ch}(\lambda f), \qquad \forall\lambda\in\mathbb R,
\]
ensures that the Laplacian (and so the gradient flow of $\mathrm{Ch}$) is 1-homogeneous.
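
To see concretely why the energy in this example is not quadratic, one can test the parallelogram identity on the pointwise integrand. The following computation is only an illustrative sketch: the functions $f(x,y)=x$, $g(x,y)=y$ and the symbol $q$ are our own choices, and the comparison is made pointwise (to obtain finite energies one would localize with cut-off functions, which we do not do here).

```latex
% Pointwise integrand from Remark 4.7 (distance induced by the L^\infty norm on R^2):
%   q(f) := ( |\partial_x f| + |\partial_y f| )^2 .
% Illustrative choice (an assumption, not taken from the text): f(x,y) = x, g(x,y) = y.
\begin{align*}
q(f)   &= (1+0)^2 = 1, & q(g)   &= (0+1)^2 = 1,\\
q(f+g) &= (1+1)^2 = 4, & q(f-g) &= (1+|-1|)^2 = 4.
\end{align*}
% A quadratic form would satisfy the parallelogram identity, but here
\[
q(f+g) + q(f-g) = 8 \;\neq\; 4 = 2\,q(f) + 2\,q(g).
\]
```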



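We can now write

\[
\frac{d}{dt}f_t = \Delta f_t
\]

for gradient flows $f_t$ of $\mathrm{Ch}$, the derivative being understood in $L^2(X,\mathfrak m)$, in accordance with
the classical case. The classical Hilbertian theory of gradient flows also ensures that

\[
\lim_{t\to\infty}\mathrm{Ch}(f_t)=0 \qquad\text{and}\qquad \frac{d}{dt}\mathrm{Ch}(f_t) = -\|\Delta f_t\|^2_{L^2(X,\mathfrak m)} \quad\text{for a.e. } t\in(0,\infty). \tag{4.4}
\]

Proposition 4.8 (Integration by parts) For all $f\in D(\Delta)$, $g\in D(\mathrm{Ch})$ it holds

\[
\Big|\int_X g\,\Delta f\,d\mathfrak m\Big| \le \int_X |Dg|_*\,|Df|_*\,d\mathfrak m. \tag{4.5}
\]

Also, let $f\in D(\Delta)$ and $\phi\in C^1(\mathbb R)$ with bounded derivative on an interval containing the
image of $f$. Then

\[
\int_X \phi(f)\,\Delta f\,d\mathfrak m = -\int_X |Df|_*^2\,\phi'(f)\,d\mathfrak m. \tag{4.6}
\]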

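Proof Since $-\Delta f\in\partial^-\mathrm{Ch}(f)$ it holds

\[
\mathrm{Ch}(f) - \int_X \varepsilon g\,\Delta f\,d\mathfrak m \le \mathrm{Ch}(f+\varepsilon g), \qquad \forall g\in L^2(X,\mathfrak m),\ \varepsilon\in\mathbb R.
\]

For $\varepsilon>0$, $|Df|_*+\varepsilon|Dg|_*$ is a relaxed slope of $f+\varepsilon g$ (possibly not minimal). Thus it holds
$2\mathrm{Ch}(f+\varepsilon g)\le\int_X(|Df|_*+\varepsilon|Dg|_*)^2\,d\mathfrak m$ and therefore

\[
-\int_X \varepsilon g\,\Delta f\,d\mathfrak m \le \frac12\int_X \big((|Df|_*+\varepsilon|Dg|_*)^2-|Df|_*^2\big)\,d\mathfrak m = \varepsilon\int_X |Df|_*\,|Dg|_*\,d\mathfrak m + o(\varepsilon).
\]

Dividing by $\varepsilon$, letting $\varepsilon\downarrow0$ and then repeating the argument with $-g$ in place of $g$ we get
(4.5).

For the second part we recall that, by the chain rule, $|D(f+\varepsilon\phi(f))|_*=(1+\varepsilon\phi'(f))|Df|_*$
for $|\varepsilon|$ small enough. Hence

\[
\mathrm{Ch}(f+\varepsilon\phi(f)) - \mathrm{Ch}(f) = \frac12\int_X |Df|_*^2\big((1+\varepsilon\phi'(f))^2-1\big)\,d\mathfrak m = \varepsilon\int_X |Df|_*^2\,\phi'(f)\,d\mathfrak m + o(\varepsilon),
\]

which implies that for any $v\in\partial^-\mathrm{Ch}(f)$ it holds $\int_X v\,\phi(f)\,d\mathfrak m=\int_X|Df|_*^2\,\phi'(f)\,d\mathfrak m$, and gives
the thesis with $v=-\Delta f$. □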

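Proposition 4.9 (Some properties of the gradient flow of Ch) Let $f_0\in L^2(X,\mathfrak m)$ and
let $(f_t)$ be the gradient flow of $\mathrm{Ch}$ starting from $f_0$. Then the following properties hold.
Mass preservation. $\int f_t\,d\mathfrak m=\int f_0\,d\mathfrak m$ for any $t\ge0$.

Maximum principle. If $f_0\le C$ (resp. $f_0\ge c$) $\mathfrak m$-a.e. in $X$, then $f_t\le C$ (resp. $f_t\ge c$) $\mathfrak m$-a.e. in
$X$ for any $t\ge0$.

Entropy dissipation. Suppose $0<c\le f_0\le C<\infty$ $\mathfrak m$-a.e. Then $t\mapsto\int f_t\log f_t\,d\mathfrak m$ is
absolutely continuous in $[0,\infty)$ and it holds

\[
\frac{d}{dt}\int_X f_t\log f_t\,d\mathfrak m = -\int_X \frac{|Df_t|_*^2}{f_t}\,d\mathfrak m \qquad\text{for a.e. } t\in(0,\infty).
\]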

Proof

Mass preservation. Just notice that from (4.5) we get

\[
\Big|\frac{d}{dt}\int_X f_t\,d\mathfrak m\Big| = \Big|\int_X \mathbf 1\cdot\Delta f_t\,d\mathfrak m\Big| \le \int_X |D\mathbf 1|_*\,|Df_t|_*\,d\mathfrak m = 0 \qquad\text{for a.e. } t\in(0,\infty),
\]

where $\mathbf 1$ is the function identically equal to 1, which has minimal relaxed gradient equal to 0.

Maximum principle. Fix $f\in L^2(X,\mathfrak m)$, $\tau>0$ and, according to the implicit Euler scheme,
let $f^\tau$ be the unique minimizer of

\[
g \mapsto \mathrm{Ch}(g) + \frac{1}{2\tau}\int_X |g-f|^2\,d\mathfrak m.
\]

Assume that $f\le C$. We claim that in this case $f^\tau\le C$ as well. Indeed, if this is not the case
we can consider the competitor $g:=\min\{f^\tau,C\}$ in the above minimization problem. By (a)
of Proposition 4.5 we get $\mathrm{Ch}(g)\le\mathrm{Ch}(f^\tau)$ and the $L^2$ distance of $f$ and $g$ is strictly smaller
than the one of $f$ and $f^\tau$ as soon as $\mathfrak m(\{f^\tau>C\})>0$, which is a contradiction.

Starting from $f_0$, iterating this procedure, and using the fact that the implicit Euler
scheme converges as $\tau\downarrow0$ (see [8], [3] for details) to the gradient flow we get the conclusion.
The same argument applies to uniform bounds from below.

Entropy dissipation. The map $z\mapsto z\log z$ is Lipschitz on $[c,C]$ which, together with the
maximum principle and the fact that $t\mapsto f_t\in L^2(X,\mathfrak m)$ is locally absolutely continuous,
yields the claimed absolute continuity statement. Now notice that we have $\frac{d}{dt}\int f_t\log f_t\,d\mathfrak m=\int(\log f_t+1)\Delta f_t\,d\mathfrak m$
for a.e. $t$. Since by the maximum principle $f_t\ge c$ $\mathfrak m$-a.e., the function
$\log z+1$ is Lipschitz and $C^1$ on the image of $f_t$ for any $t\ge0$, thus from (4.6) we get the
conclusion. □
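
The last step of the proof can be spelled out as follows. This is just a worked instance of (4.6) with the specific choice $\phi(z)=\log z+1$, written for a fixed time $t$ at which $f_t\in D(\Delta)$ and the time derivative exists.

```latex
% Apply (4.6) with \phi(z) = \log z + 1, so that \phi'(z) = 1/z,
% which is bounded on [c, C]; by the maximum principle [c, C] contains the image of f_t.
\begin{align*}
\frac{d}{dt}\int_X f_t\log f_t\,d\mathfrak m
 &= \int_X (\log f_t + 1)\,\Delta f_t\,d\mathfrak m
 && \text{(since $\tfrac{d}{dt}f_t=\Delta f_t$)}\\
 &= -\int_X |Df_t|_*^2\,\phi'(f_t)\,d\mathfrak m
 && \text{(by (4.6))}\\
 &= -\int_X \frac{|Df_t|_*^2}{f_t}\,d\mathfrak m .
\end{align*}
```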






4.2 The "horizontal" approach: weak upper gradients 

In this subsection, following the approach of [4, 5], we introduce a different notion of "weak
norm of gradient" in a compact and normalized metric measure space $(X,\mathsf d,\mathfrak m)$. This notion
of gradient is Lagrangian in spirit, it does not require a relaxation procedure, it will provide
a new estimate of entropy dissipation along the gradient flow of $\mathrm{Ch}$, and it will also be useful
in the analysis of the derivative of the entropy along Wasserstein geodesics.

While the definition of minimal relaxed slope was taken from Cheeger's work [9], the
notion we are going to introduce is inspired by the work of Koskela-MacManus [20] and
Shanmugalingam [30], the only difference being that we consider a different notion of null set
of curves.

4.2.1 Negligible sets of curves and functions Sobolev along a.e. curve 

Recall that the evaluation maps $e_t:C([0,1],X)\to X$ are defined by $e_t(\gamma):=\gamma_t$. We also
introduce the restriction maps $\mathrm{restr}_t^s:C([0,1],X)\to C([0,1],X)$, $0\le t\le s\le1$, given by

\[
\mathrm{restr}_t^s(\gamma)_r := \gamma_{(1-r)t+rs}, \tag{4.7}
\]

so that $\mathrm{restr}_t^s$ restricts the curve $\gamma$ to the interval $[t,s]$ and then "stretches" it on the whole
of $[0,1]$.

Definition 4.10 (Test plans and negligible sets of curves) We say that a probability
measure $\pi\in\mathscr P(C([0,1],X))$ is a test plan if it is concentrated on $AC^2([0,1];X)$,
$\iint_0^1|\dot\gamma_t|^2\,dt\,d\pi<\infty$, and there exists a constant $C(\pi)$ such that

\[
(e_t)_\sharp\pi \le C(\pi)\,\mathfrak m \qquad\text{for every } t\in[0,1]. \tag{4.8}
\]

A Borel set $A\subset AC^2([0,1],X)$ is said to be negligible if $\pi(A)=0$ for any test plan $\pi$. A property
which holds for every $\gamma\in AC^2([0,1],X)$, except possibly a negligible set, is said to hold for
almost every curve.

Remark 4.11 An easy consequence of condition (4.8) is that if two $\mathfrak m$-measurable functions
$f,g:X\to\mathbb R$ coincide up to an $\mathfrak m$-negligible set and $\mathcal T$ is an at most countable subset of $[0,1]$,
then the functions $f\circ\gamma$ and $g\circ\gamma$ coincide in $\mathcal T$ for almost every curve $\gamma$.

Moreover, choosing an arbitrary test plan $\pi$ and applying Fubini's Theorem to the product
measure $\mathscr L^1\times\pi$ in $(0,1)\times C([0,1];X)$ we also obtain that $f\circ\gamma=g\circ\gamma$ $\mathscr L^1$-a.e. in $(0,1)$ for
$\pi$-a.e. curve $\gamma$; since $\pi$ is arbitrary, the same property holds for almost every curve.

Coupled with the definition of negligible set of curves, there are the definitions of weak upper 
gradient and of functions which are Sobolev along a.e. curve. 

Definition 4.12 (Weak upper gradients) A Borel function $g:X\to[0,\infty]$ is a weak
upper gradient of $f:X\to\mathbb R$ if

\[
\Big|\int_{\partial\gamma} f\Big| \le \int_\gamma g \qquad\text{for a.e. } \gamma. \tag{4.9}
\]

Definition 4.13 (Sobolev functions along a.e. curve) A function $f:X\to\mathbb R$ is Sobolev
along a.e. curve if for a.e. curve $\gamma$ the function $f\circ\gamma$ coincides a.e. in $[0,1]$ and in $\{0,1\}$ with
an absolutely continuous map $f_\gamma:[0,1]\to\mathbb R$.

By Remark 4.11 applied to $\mathcal T:=\{0,1\}$, (4.9) does not depend on the particular representative
of $f$ in the class of $\mathfrak m$-measurable functions coinciding with $f$ up to an $\mathfrak m$-negligible set. The
same Remark also shows that the property of being Sobolev along almost every curve $\gamma$ is
independent of the representative in the class of $\mathfrak m$-measurable functions coinciding with $f$
$\mathfrak m$-a.e. in $X$.

In the following remarks we will make use of this basic calculus lemma: 

Lemma 4.14 Let $f:(0,1)\to\mathbb R$ be Lebesgue measurable, $q\in[1,\infty]$, and let $g\in L^q(0,1)$ be
nonnegative and satisfy

\[
|f(s)-f(t)| \le \Big|\int_s^t g(r)\,dr\Big| \qquad\text{for $\mathscr L^2$-a.e. } (s,t)\in(0,1)^2.
\]

Then $f\in W^{1,q}(0,1)$ and $|f'|\le g$ a.e. in $(0,1)$.

Proof It is immediate to check that $f\in L^\infty(0,1)$. Let $N\subset(0,1)^2$ be the $\mathscr L^2$-negligible
subset where the above inequality fails. By Fubini's theorem, also the set $\{(t,h)\in(0,1)^2 :
(t,t+h)\in N\cap(0,1)^2\}$ is $\mathscr L^2$-negligible. In particular, by Fubini's theorem, for a.e. $h$ we
have $(t,t+h)\notin N$ for a.e. $t\in(0,1)$. Let $h_i\downarrow0$ with this property and use the identities

\[
\int_0^1 f(t)\,\frac{\varphi(t-h)-\varphi(t)}{-h}\,dt = -\int_0^1 \frac{f(t+h)-f(t)}{h}\,\varphi(t)\,dt
\]

with $\varphi\in C_c^1(0,1)$ and $h=h_i$ sufficiently small to get

\[
\Big|\int_0^1 f(t)\,\varphi'(t)\,dt\Big| \le \int_0^1 g(t)\,|\varphi(t)|\,dt.
\]

It follows that the distributional derivative of $f$ is a signed measure $\eta$ with finite total variation
which satisfies

\[
-\int_0^1 f\varphi'\,dt = \int_0^1 \varphi\,d\eta, \qquad \Big|\int_0^1 \varphi\,d\eta\Big| \le \int_0^1 g\,|\varphi|\,dt \qquad\text{for every } \varphi\in C_c^1(0,1);
\]

therefore $\eta$ is absolutely continuous with respect to the Lebesgue measure with $|\eta|\le g\,\mathscr L^1$.
This gives the $W^{1,1}(0,1)$ regularity and, at the same time, the inequality $|f'|\le g$ a.e. in
$(0,1)$. The case $q>1$ immediately follows by applying this inequality when $g\in L^q(0,1)$.

□

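With the aid of this lemma, we can prove that the existence of a weak upper gradient $g$ such
that $\int_\gamma g<\infty$ for a.e. $\gamma$ (in particular if $g\in L^2(X,\mathfrak m)$) implies Sobolev regularity along a.e.
curve.

Remark 4.15 (Restriction and equivalent formulation) Notice that if $\pi$ is a test plan,
so is $(\mathrm{restr}_t^s)_\sharp\pi$. Hence if $g$ is a weak upper gradient of $f$ such that $\int_\gamma g<\infty$ for a.e. $\gamma$, then
for every $t<s$ in $[0,1]$ it holds

\[
|f(\gamma_s)-f(\gamma_t)| \le \int_t^s g(\gamma_r)|\dot\gamma_r|\,dr \qquad\text{for a.e. } \gamma.
\]

Let $\pi$ be a test plan: by Fubini's theorem applied to the product measure $\mathscr L^2\times\pi$ in $(0,1)^2\times
C([0,1];X)$, it follows that for $\pi$-a.e. $\gamma$ the function $f$ satisfies

\[
|f(\gamma_s)-f(\gamma_t)| \le \Big|\int_t^s g(\gamma_r)|\dot\gamma_r|\,dr\Big| \qquad\text{for $\mathscr L^2$-a.e. } (t,s)\in(0,1)^2.
\]

An analogous argument shows that

\[
\begin{cases}
|f(\gamma_s)-f(\gamma_0)| \le \int_0^s g(\gamma_r)|\dot\gamma_r|\,dr,\\[2pt]
|f(\gamma_1)-f(\gamma_s)| \le \int_s^1 g(\gamma_r)|\dot\gamma_r|\,dr,
\end{cases}
\qquad\text{for $\mathscr L^1$-a.e. } s\in(0,1). \tag{4.10}
\]

Since $g\circ\gamma\,|\dot\gamma|\in L^1(0,1)$ for $\pi$-a.e. $\gamma$, by Lemma 4.14 it follows that $f\circ\gamma\in W^{1,1}(0,1)$ for
$\pi$-a.e. $\gamma$, and

\[
\Big|\frac{d}{dt}(f\circ\gamma)\Big| \le g\circ\gamma\,|\dot\gamma| \qquad\text{a.e. in } (0,1),\ \text{for $\pi$-a.e. } \gamma. \tag{4.11}
\]

Since $\pi$ is arbitrary, we conclude that $f\circ\gamma\in W^{1,1}(0,1)$ for a.e. $\gamma$, and therefore it admits an
absolutely continuous representative $f_\gamma$; moreover, by (4.10), it is immediate to check that
$f(\gamma(t))=f_\gamma(t)$ for $t\in\{0,1\}$ and a.e. $\gamma$. ■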

Remark 4.16 (An approach with a non explicit use of negligible set of curves)

The previous remark could be used to introduce the notion of weak upper gradients without
speaking (explicitly) of Borel sets at all. One can simply say that $g\in L^2(X,\mathfrak m)$ is a weak
upper gradient of $f:X\to\mathbb R$ provided that for every test plan $\pi$ it holds

\[
\int |f(\gamma_1)-f(\gamma_0)|\,d\pi(\gamma) \le \iint_0^1 g(\gamma_s)|\dot\gamma_s|\,ds\,d\pi(\gamma)
\]

(this has been the approach followed in [13]).







Proposition 4.17 (Locality) Let $f:X\to\mathbb R$ be Sobolev along almost all absolutely continuous
curves, and let $G_1$, $G_2$ be weak upper gradients of $f$. Then $\min\{G_1,G_2\}$ is a weak upper
gradient of $f$.

Proof It is a direct consequence of (4.11). □



Definition 4.18 (Minimal weak upper gradient) Let $f:X\to\mathbb R$ be Sobolev along almost
all curves. The minimal weak upper gradient $|Df|_w$ of $f$ is the weak upper gradient
characterized, up to $\mathfrak m$-negligible sets, by the property

\[
|Df|_w \le G \quad \mathfrak m\text{-a.e. in } X, \qquad\text{for every weak upper gradient $G$ of $f$.} \tag{4.12}
\]

Uniqueness of the minimal weak upper gradient is obvious. For existence, we take $|Df|_w:=
\inf_n G_n$, where $G_n$ are weak upper gradients which provide a minimizing sequence in

\[
\inf\Big\{\int_X \tan^{-1}G\,d\mathfrak m : G \text{ is a weak upper gradient of } f\Big\}.
\]

We immediately see, thanks to Proposition 4.17, that we can assume with no loss of generality
that $G_{n+1}\le G_n$. Hence, by monotone convergence, the function $|Df|_w$ is a weak upper
gradient of $f$ and $\int_X\tan^{-1}G\,d\mathfrak m$ is minimal at $G=|Df|_w$. This minimality, in conjunction
with Proposition 4.17, gives (4.12).

Theorem 4.19 (Stability w.r.t. m-a.e. convergence) Assume that $f_n$ are $\mathfrak m$-measurable,
Sobolev along almost all curves and that $G_n$ are weak upper gradients of
$f_n$. Assume furthermore that $f_n(x)\to f(x)\in\mathbb R$ for $\mathfrak m$-a.e. $x\in X$ and that $(G_n)$ weakly
converges to $G$ in $L^2(X,\mathfrak m)$. Then $G$ is a weak upper gradient of $f$.

Proof Fix a test plan $\pi$. By Mazur's theorem we can find convex combinations

\[
H_n := \sum_{i=N_n+1}^{N_{n+1}} \alpha_i G_i \qquad\text{with } \alpha_i\ge0,\quad \sum_{i=N_n+1}^{N_{n+1}}\alpha_i=1,\quad N_n\to\infty,
\]

converging strongly to $G$ in $L^2(X,\mathfrak m)$. Denoting by $\tilde f_n$ the corresponding convex combinations
of $f_n$, $H_n$ are weak upper gradients of $\tilde f_n$ and still $\tilde f_n\to f$ $\mathfrak m$-a.e. in $X$.

Since for every nonnegative Borel function $\varphi:X\to[0,\infty]$ it holds (with $C=C(\pi)$)

\begin{align*}
\int\Big(\int_\gamma\varphi\Big)\,d\pi
 &= \int\Big(\int_0^1 \varphi(\gamma_t)|\dot\gamma_t|\,dt\Big)\,d\pi
 \le \int\Big(\int_0^1 \varphi^2(\gamma_t)\,dt\Big)^{1/2}\Big(\int_0^1|\dot\gamma_t|^2\,dt\Big)^{1/2}\,d\pi\\
 &\le \Big(\int_0^1\!\!\int \varphi^2\,d(e_t)_\sharp\pi\,dt\Big)^{1/2}\Big(\iint_0^1|\dot\gamma_t|^2\,dt\,d\pi\Big)^{1/2}\\
 &\le \Big(C\int_X\varphi^2\,d\mathfrak m\Big)^{1/2}\Big(\iint_0^1|\dot\gamma_t|^2\,dt\,d\pi\Big)^{1/2}, \tag{4.13}
\end{align*}

we obtain, for $\bar C:=\sqrt C\,\big(\iint_0^1|\dot\gamma_t|^2\,dt\,d\pi\big)^{1/2}$,

\[
\int\Big(\int_\gamma |H_n-G| + \min\{|\tilde f_n-f|,1\}\Big)\,d\pi \le \bar C\Big(\|H_n-G\|_{L^2} + \|\min\{|\tilde f_n-f|,1\}\|_{L^2}\Big) \to 0.
\]

By a diagonal argument we can find a subsequence $n(k)$ such that $\int_\gamma|H_{n(k)}-G|+\min\{|\tilde f_{n(k)}-f|,1\}\to0$
as $k\to\infty$ for $\pi$-a.e. $\gamma$. Since $\tilde f_n$ converge $\mathfrak m$-a.e. to $f$ and the marginals of $\pi$ are
absolutely continuous w.r.t. $\mathfrak m$ we have also that for $\pi$-a.e. $\gamma$ it holds $\tilde f_n(\gamma_0)\to f(\gamma_0)$ and
$\tilde f_n(\gamma_1)\to f(\gamma_1)$.

If we fix a curve $\gamma$ satisfying these convergence properties, since $(\tilde f_{n(k)})_\gamma$ are equi-absolutely
continuous (being their derivatives bounded by $H_{n(k)}\circ\gamma\,|\dot\gamma|$) and a further subsequence of $\tilde f_{n(k)}$
converges a.e. in $[0,1]$ and in $\{0,1\}$ to $f(\gamma_s)$, we can pass to the limit to obtain an absolutely
continuous function $f_\gamma$ equal to $f(\gamma_s)$ a.e. in $[0,1]$ and in $\{0,1\}$ with derivative bounded by
$G(\gamma_s)|\dot\gamma_s|$. Since $\pi$ is arbitrary we conclude that $f$ is Sobolev along almost all curves and that
$G$ is a weak upper gradient of $f$. □



Remark 4.20 ($|Df|_w\le|Df|_*$) An immediate consequence of Theorem 4.19 is
that any $f\in D(\mathrm{Ch})$ is Sobolev along a.e. curve and satisfies $|Df|_w\le|Df|_*$. Indeed, for
such $f$ just pick a sequence of Lipschitz functions $f_n$ converging to $f$ in $L^2(X,\mathfrak m)$ such that
$|Df_n|\to|Df|_*$ in $L^2(X,\mathfrak m)$ (as in Proposition 4.3) and recall that for Lipschitz functions the
local Lipschitz constant is an upper gradient. ■



4.2.2 A bound from below on weak gradients 

In this short subsection we show how, using test plans and the very definition of minimal
weak gradients, it is possible to use $|Df|_w$ to bound from below the increments of the relative
entropy. We start with the following result, proved - in a more general setting - by Lisini in [22]:
it shows how to associate to a curve $\mu\in AC^2([0,1];(\mathscr P(X),W_2))$ a plan $\pi\in\mathscr P(C([0,1],X))$
concentrated on $AC^2([0,1];X)$ representing the curve itself (see also Theorem 8.2.1 of [3] for
the Euclidean case). We will only sketch the proof.

Proposition 4.21 (Superposition principle) Let $(X,\mathsf d)$ be a compact space and let
$\mu\in AC^2([0,1];(\mathscr P(X),W_2))$. Then there exists $\pi\in\mathscr P(C([0,1],X))$ concentrated on
$AC^2([0,1];X)$ such that $(e_t)_\sharp\pi=\mu_t$ for any $t\in[0,1]$ and $\int|\dot\gamma_t|^2\,d\pi(\gamma)=|\dot\mu_t|^2$ for a.e.
$t\in[0,1]$.

Proof If $\pi\in\mathscr P(C([0,1],X))$ is any plan concentrated on $AC^2([0,1],X)$ such that $(e_t)_\sharp\pi=\mu_t$ for
any $t\in[0,1]$, since $(e_t,e_s)_\sharp\pi\in\mathrm{Adm}(\mu_t,\mu_s)$, for any $t<s$ it holds

\[
W_2^2(\mu_t,\mu_s) \le \int \mathsf d^2(\gamma_t,\gamma_s)\,d\pi(\gamma) \le \int\Big(\int_t^s|\dot\gamma_r|\,dr\Big)^2 d\pi(\gamma) \le (s-t)\iint_t^s|\dot\gamma_r|^2\,dr\,d\pi(\gamma),
\]

which shows that $|\dot\mu_t|^2\le\int|\dot\gamma_t|^2\,d\pi(\gamma)$ for a.e. $t$. Hence, to conclude it is sufficient to find a
plan $\pi\in\mathscr P(C([0,1],X))$, concentrated on $AC^2([0,1],X)$, with $(e_t)_\sharp\pi=\mu_t$ for any $t\in[0,1]$
such that $\int_0^1|\dot\mu_t|^2\,dt\ge\iint_0^1|\dot\gamma_t|^2\,dt\,d\pi(\gamma)$.

To build such a $\pi$ we make the simplifying assumption that $(X,\mathsf d)$ is geodesic (the proof
for the general case is similar, but rather than interpolating with piecewise geodesic curves
one uses piecewise constant ones; this leads to some technical complications that we want to
avoid here - see [22] for the complete argument). Fix $n\in\mathbb N$ and use a gluing argument to find
$\gamma^n\in\mathscr P(X^{n+1})$ such that $(\pi^i,\pi^{i+1})_\sharp\gamma^n\in\mathrm{OPT}(\mu_{i/n},\mu_{(i+1)/n})$ for $i=0,\dots,n-1$. By standard
measurable selection arguments, there exists a Borel map $T^n:X^{n+1}\to C([0,1],X)$ such that
$\gamma:=T^n(x_0,\dots,x_n)$ is a constant speed geodesic on each of the intervals $[i/n,(i+1)/n]$ and
$\gamma_{i/n}=x_i$, $i=0,\dots,n$. Define $\pi^n:=T^n_\sharp\gamma^n$. It holds

\[
\iint_0^1|\dot\gamma_t|^2\,dt\,d\pi^n(\gamma) = n\sum_{i=0}^{n-1}\int \mathsf d^2\big(\gamma_{i/n},\gamma_{(i+1)/n}\big)\,d\pi^n(\gamma) = n\sum_{i=0}^{n-1}W_2^2\big(\mu_{i/n},\mu_{(i+1)/n}\big) \le \int_0^1|\dot\mu_t|^2\,dt. \tag{4.14}
\]

Now notice that the map $E:C([0,1],X)\to[0,\infty]$ given by $E(\gamma):=\int_0^1|\dot\gamma_t|^2\,dt$ if $\gamma\in
AC^2([0,1],X)$ and $+\infty$ otherwise, is lower semicontinuous and, via a simple equicontinuity
argument, with compact sublevels. Therefore by Prokhorov's theorem we get that $(\pi^n)\subset
\mathscr P(C([0,1],X))$ is a tight sequence, hence for any limit measure $\pi$ the uniform bound (4.14)
gives the thesis. □
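
The two elementary computations behind (4.14) can be sketched as follows, assuming as in the proof that $\gamma$ is a constant speed geodesic on each interval $[i/n,(i+1)/n]$ joining $x_i$ to $x_{i+1}$.

```latex
% On [i/n,(i+1)/n] a constant speed geodesic from x_i to x_{i+1} has speed
% |\dot\gamma_t| = n\,\mathsf d(x_i,x_{i+1}), hence
\[
\int_{i/n}^{(i+1)/n} |\dot\gamma_t|^2\,dt
  = \frac1n\, n^2\, \mathsf d^2(x_i,x_{i+1})
  = n\,\mathsf d^2(x_i,x_{i+1}).
\]
% For the curve (\mu_t), the definition of metric speed and the
% Cauchy-Schwarz inequality give
\[
W_2^2\big(\mu_{i/n},\mu_{(i+1)/n}\big)
  \le \Big(\int_{i/n}^{(i+1)/n} |\dot\mu_t|\,dt\Big)^2
  \le \frac1n \int_{i/n}^{(i+1)/n} |\dot\mu_t|^2\,dt ,
\]
% so that summing over i = 0,\dots,n-1 yields
\[
n\sum_{i=0}^{n-1} W_2^2\big(\mu_{i/n},\mu_{(i+1)/n}\big)
  \le \int_0^1 |\dot\mu_t|^2\,dt .
\]
```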



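Proposition 4.22 Let $[0,1]\ni t\mapsto\mu_t=f_t\mathfrak m$ be a curve in $AC^2([0,1],(\mathscr P(X),W_2))$. Assume
that for some $0<c<C<\infty$ it holds $c\le f_t\le C$ $\mathfrak m$-a.e. for any $t\in[0,1]$, and that $f_0$ is
Sobolev along a.e. curve with $|Df_0|_w\in L^2(X,\mathfrak m)$. Then

\[
\int_X f_0\log f_0\,d\mathfrak m - \int_X f_t\log f_t\,d\mathfrak m \le \frac12\int_0^t\!\!\int_X \frac{|Df_0|_w^2}{f_0^2}\,f_s\,d\mathfrak m\,ds + \frac12\int_0^t|\dot\mu_s|^2\,ds, \qquad \forall t>0.
\]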

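Proof Let $\pi\in\mathscr P(C([0,1],X))$ be a plan associated to the curve $(\mu_t)$ as in Proposition 4.21.
The assumption $f_t\le C$ $\mathfrak m$-a.e. and the fact that $\iint_0^1|\dot\gamma_t|^2\,dt\,d\pi(\gamma)=\int_0^1|\dot\mu_t|^2\,dt<\infty$ guarantee
that $\pi$ is a test plan. Now notice that it holds $|D\log f_t|_w=|Df_t|_w/f_t$ (because $z\mapsto\log z$ is
$C^1$ in $[c,C]$), thus we get

\begin{align*}
\int_X f_0\log f_0\,d\mathfrak m - \int_X f_t\log f_t\,d\mathfrak m
 &\le \int_X \log f_0\,(f_0-f_t)\,d\mathfrak m = \int \big(\log f_0\circ e_0 - \log f_0\circ e_t\big)\,d\pi\\
 &\le \iint_0^t \frac{|Df_0|_w(\gamma_s)}{f_0(\gamma_s)}\,|\dot\gamma_s|\,ds\,d\pi(\gamma)\\
 &\le \frac12\iint_0^t \frac{|Df_0|_w^2(\gamma_s)}{f_0^2(\gamma_s)}\,ds\,d\pi(\gamma) + \frac12\iint_0^t|\dot\gamma_s|^2\,ds\,d\pi(\gamma)\\
 &= \frac12\int_0^t\!\!\int_X \frac{|Df_0|_w^2}{f_0^2}\,f_s\,d\mathfrak m\,ds + \frac12\int_0^t|\dot\mu_s|^2\,ds.
\end{align*}

□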
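
For the reader's convenience we sketch the two pointwise inequalities used in the chain above; nothing here goes beyond the convexity of $z\mapsto z\log z$ and Young's inequality. The symbols $e$, $a$, $b$ are shorthand introduced only for this illustration.

```latex
% (i) Convexity of e(z) = z\log z, with e'(z) = \log z + 1:
%     e(a) - e(b) \le e'(a)\,(a - b)   for a, b > 0.
\[
f_0\log f_0 - f_t\log f_t \le (\log f_0 + 1)(f_0 - f_t)
\quad\Longrightarrow\quad
\int_X f_0\log f_0\,d\mathfrak m - \int_X f_t\log f_t\,d\mathfrak m
\le \int_X \log f_0\,(f_0-f_t)\,d\mathfrak m ,
\]
% because \int_X (f_0 - f_t)\,d\mathfrak m = 1 - 1 = 0 (both are probability densities).
% (ii) Young's inequality ab \le a^2/2 + b^2/2 with
%      a = |Df_0|_w(\gamma_s)/f_0(\gamma_s),  b = |\dot\gamma_s| :
\[
\frac{|Df_0|_w(\gamma_s)}{f_0(\gamma_s)}\,|\dot\gamma_s|
\le \frac12\,\frac{|Df_0|_w^2(\gamma_s)}{f_0^2(\gamma_s)}
  + \frac12\,|\dot\gamma_s|^2 .
\]
% Integrating (ii) in s and \pi, and using (e_s)_\sharp\pi = f_s\,\mathfrak m, the first term becomes
% \frac12\int_0^t\int_X \frac{|Df_0|_w^2}{f_0^2}\,f_s\,d\mathfrak m\,ds .
```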



4.3 The two notions of gradient coincide 

Here we prove that the two notions of "norm of weak gradient" we introduced coincide. We
already noticed in Remark 4.20 that $|Df|_w\le|Df|_*$, so that to conclude we need to show
that $|Df|_w\ge|Df|_*$.

The key argument to achieve this is the following lemma, which gives a sharp bound on
the $W_2$-speed of the $L^2$-gradient flow of $\mathrm{Ch}$. This lemma has been introduced in [15] to study
the heat flow on Alexandrov spaces, see also Section 6.

Lemma 4.23 (Kuwada's lemma) Let $f_0\in L^2(X,\mathfrak m)$ and let $(f_t)$ be the $L^2$-gradient flow
of $\mathrm{Ch}$ starting from $f_0$. Assume that for some $0<c<C<\infty$ it holds $c\le f_0\le C$ $\mathfrak m$-a.e. in
$X$, and that $\int f_0\,d\mathfrak m=1$. Then the curve $t\mapsto\mu_t:=f_t\mathfrak m$ is absolutely continuous w.r.t. $W_2$
and it holds

\[
|\dot\mu_t|^2 \le \int_X \frac{|Df_t|_*^2}{f_t}\,d\mathfrak m \qquad\text{for a.e. } t\in(0,\infty).
\]


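Proof We start from the duality formula (2.5) with $\varphi=-\psi$: taking into account the factor
2 and using the identity $Q_1(-\psi)=\psi^c$ we get

\[
\frac{W_2^2(\mu,\nu)}{2} = \sup_\varphi \int_X Q_1\varphi\,d\nu - \int_X \varphi\,d\mu, \tag{4.15}
\]

where the supremum runs among all Lipschitz functions $\varphi$.

Fix such a $\varphi$ and recall (Proposition 3.3) that the map $t\mapsto Q_t\varphi$ is Lipschitz with values
in $L^\infty(X,\mathfrak m)$, and a fortiori in $L^2(X,\mathfrak m)$.

Fix also $0\le t<s$, set $\ell=(s-t)$ and recall that since $(f_t)$ is the gradient flow of $\mathrm{Ch}$ in
$L^2(X,\mathfrak m)$, the map $[0,\ell]\ni\tau\mapsto f_{t+\tau}$ is absolutely continuous with values in $L^2$. Therefore the map
$[0,\ell]\ni\tau\mapsto Q_{\tau/\ell}\varphi\,f_{t+\tau}$ is absolutely continuous with values in $L^2$. The equality

\[
\frac{Q_{(\tau+h)/\ell}\varphi\,f_{t+\tau+h} - Q_{\tau/\ell}\varphi\,f_{t+\tau}}{h} = f_{t+\tau}\,\frac{Q_{(\tau+h)/\ell}\varphi - Q_{\tau/\ell}\varphi}{h} + Q_{(\tau+h)/\ell}\varphi\,\frac{f_{t+\tau+h}-f_{t+\tau}}{h},
\]

together with the uniform continuity of $(x,\tau)\mapsto Q_{\tau/\ell}\varphi(x)$ shows that the derivative of $\tau\mapsto
Q_{\tau/\ell}\varphi\,f_{t+\tau}$ can be computed via the Leibniz rule.

We have:

\begin{align*}
\int_X Q_1\varphi\,d\mu_s - \int_X \varphi\,d\mu_t
 &= \int_X Q_1\varphi\,f_{t+\ell}\,d\mathfrak m - \int_X \varphi\,f_t\,d\mathfrak m = \int_X\!\int_0^\ell \frac{d}{d\tau}\big(Q_{\tau/\ell}\varphi\,f_{t+\tau}\big)\,d\tau\,d\mathfrak m\\
 &\le \int_X\!\int_0^\ell \Big(-\frac{|DQ_{\tau/\ell}\varphi|^2}{2\ell}\,f_{t+\tau} + Q_{\tau/\ell}\varphi\,\Delta f_{t+\tau}\Big)\,d\tau\,d\mathfrak m, \tag{4.16}
\end{align*}

having used Theorem 3.5. Observe that by inequalities (4.5) and (4.1) we have

\begin{align*}
\int_X Q_{\tau/\ell}\varphi\,\Delta f_{t+\tau}\,d\mathfrak m
 &\le \int_X |DQ_{\tau/\ell}\varphi|_*\,|Df_{t+\tau}|_*\,d\mathfrak m \le \int_X |DQ_{\tau/\ell}\varphi|\,|Df_{t+\tau}|_*\,d\mathfrak m\\
 &\le \frac{1}{2\ell}\int_X |DQ_{\tau/\ell}\varphi|^2\,f_{t+\tau}\,d\mathfrak m + \frac{\ell}{2}\int_X \frac{|Df_{t+\tau}|_*^2}{f_{t+\tau}}\,d\mathfrak m. \tag{4.17}
\end{align*}

Plugging this inequality in (4.16), we obtain

\[
\int_X Q_1\varphi\,d\mu_s - \int_X \varphi\,d\mu_t \le \frac{\ell}{2}\int_0^\ell\!\!\int_X \frac{|Df_{t+\tau}|_*^2}{f_{t+\tau}}\,d\mathfrak m\,d\tau.
\]

This latter bound does not depend on $\varphi$, so from (4.15) we deduce

\[
W_2^2(\mu_t,\mu_s) \le \ell\int_0^\ell\!\!\int_X \frac{|Df_{t+\tau}|_*^2}{f_{t+\tau}}\,d\mathfrak m\,d\tau.
\]

Since $f_r\ge c$ for any $r\ge0$ and $r\mapsto\mathrm{Ch}(f_r)$ is nonincreasing and finite for every $r>0$,
we immediately get that $t\mapsto\mu_t$ is locally Lipschitz in $(0,\infty)$. At Lebesgue points of $t\mapsto
\int_X|Df_t|_*^2/f_t\,d\mathfrak m$ we obtain the stated pointwise bound on the metric speed. □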
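
The last inequality in (4.17) is Young's inequality with a weight chosen so that the first term cancels against the Hamilton-Jacobi term in (4.16). A sketch, where the shorthand $a$, $b$, $\rho$ is introduced here only for illustration:

```latex
% Shorthand (ours): a = |DQ_{\tau/\ell}\varphi|, b = |Df_{t+\tau}|_*,
% \rho = f_{t+\tau} \ge c > 0, and \ell = s - t > 0.
\[
ab = \Big(a\sqrt{\rho/\ell}\Big)\Big(b\sqrt{\ell/\rho}\Big)
   \le \frac{a^2\rho}{2\ell} + \frac{\ell\,b^2}{2\rho}.
\]
% Adding the term -a^2\rho/(2\ell) appearing in (4.16) gives the pointwise bound
\[
-\frac{a^2\rho}{2\ell} + ab \le \frac{\ell}{2}\,\frac{b^2}{\rho},
\]
% and integrating over X \times (0,\ell), then taking the supremum in (4.15):
\[
\frac12\,W_2^2(\mu_t,\mu_s)
 \le \frac{\ell}{2}\int_0^{\ell}\!\!\int_X \frac{|Df_{t+\tau}|_*^2}{f_{t+\tau}}\,d\mathfrak m\,d\tau .
\]
```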

Theorem 4.24 Let $f\in L^2(X,\mathfrak m)$. Assume that $f$ is Sobolev along a.e. curve and that
$|Df|_w\in L^2(X,\mathfrak m)$. Then $f\in D(\mathrm{Ch})$ and $|Df|_*=|Df|_w$ $\mathfrak m$-a.e. in $X$.

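Proof Up to a truncation argument and addition of a constant, we can assume that $0<c\le
f\le C<\infty$ $\mathfrak m$-a.e. in $X$ for some $c$, $C$. Let $(f_t)$ be the $L^2$-gradient flow of $\mathrm{Ch}$ starting from
$f$ and recall that from Proposition 4.9 we have

\[
\int_X f\log f\,d\mathfrak m - \int_X f_t\log f_t\,d\mathfrak m = \int_0^t\!\!\int_X \frac{|Df_s|_*^2}{f_s}\,d\mathfrak m\,ds < \infty \qquad\text{for every } t>0.
\]

On the other hand, from Proposition 4.22 and Lemma 4.23 we have

\[
\int_X f\log f\,d\mathfrak m - \int_X f_t\log f_t\,d\mathfrak m \le \frac12\int_0^t\!\!\int_X \frac{|Df|_w^2}{f^2}\,f_s\,d\mathfrak m\,ds + \frac12\int_0^t\!\!\int_X \frac{|Df_s|_*^2}{f_s}\,d\mathfrak m\,ds. \tag{4.18}
\]

Hence we deduce

\[
\int_0^t 4\,\mathrm{Ch}(\sqrt{f_s})\,ds = \frac12\int_0^t\!\!\int_X \frac{|Df_s|_*^2}{f_s}\,d\mathfrak m\,ds \le \frac12\int_0^t\!\!\int_X \frac{|Df|_w^2}{f^2}\,f_s\,d\mathfrak m\,ds.
\]

Letting $t\downarrow0$, taking into account the $L^2$-lower semicontinuity of $\mathrm{Ch}$ and the fact - easy to
check from the maximum principle - that $\sqrt{f_s}\to\sqrt f$ as $s\downarrow0$ in $L^2(X,\mathfrak m)$, we get $\mathrm{Ch}(\sqrt f)\le
\liminf_{t\downarrow0}\frac1t\int_0^t\mathrm{Ch}(\sqrt{f_s})\,ds$. On the other hand, the bound $f\ge c>0$ ensures $\frac{|Df|_w^2}{f^2}\in L^1(X,\mathfrak m)$
and the maximum principle again together with the convergence of $f_s$ to $f$ in $L^2(X,\mathfrak m)$
when $s\downarrow0$ grants that the convergence is also weak* in $L^\infty(X,\mathfrak m)$, therefore $\int_X\frac{|Df|_w^2}{f}\,d\mathfrak m=
\lim_{t\downarrow0}\frac1t\int_0^t\!\int_X\frac{|Df|_w^2}{f^2}\,f_s\,d\mathfrak m\,ds$.
In summary, we proved

\[
\frac12\int_X \frac{|Df|_*^2}{f}\,d\mathfrak m \le \frac12\int_X \frac{|Df|_w^2}{f}\,d\mathfrak m,
\]

which, together with the inequality $|Df|_w\le|Df|_*$ $\mathfrak m$-a.e. in $X$, gives the conclusion. □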
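
The factor 4 in front of $\mathrm{Ch}(\sqrt{f_s})$ comes from the chain rule of Proposition 4.5 applied with $\phi(z)=\sqrt z$, which is $C^1$ with bounded derivative on $[c,C]$ (and can be extended to a Lipschitz function outside this interval; this extension is our own remark). A short check:

```latex
% \phi(z) = \sqrt{z}, \phi'(z) = 1/(2\sqrt{z}), hence |D\sqrt{f}|_* = |Df|_*/(2\sqrt{f}).
\[
4\,\mathrm{Ch}(\sqrt f)
 = 4\cdot\frac12\int_X |D\sqrt f|_*^2\,d\mathfrak m
 = 2\int_X \frac{|Df|_*^2}{4f}\,d\mathfrak m
 = \frac12\int_X \frac{|Df|_*^2}{f}\,d\mathfrak m .
\]
```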

We are now in the position of defining the Sobolev space $W^{1,2}(X,\mathsf d,\mathfrak m)$. We start with
the following simple and general lemma.

Lemma 4.25 Let $(B,\|\cdot\|)$ be a Banach space and let $E:B\to[0,\infty]$ be a 1-homogeneous,
convex and lower semicontinuous map. Then the vector space $\{E<\infty\}$ endowed with the
norm

\[
\|v\|_E := \sqrt{\|v\|^2 + E^2(v)},
\]

is a Banach space.

Proof It is clear that $(D(E),\|\cdot\|_E)$ is a normed space, so we only need to prove completeness.
Pick a sequence $(v_n)\subset D(E)$ which is Cauchy w.r.t. $\|\cdot\|_E$. Then, since $\|\cdot\|\le\|\cdot\|_E$ we also
get that $(v_n)$ is Cauchy w.r.t. $\|\cdot\|$, and hence there exists $v\in B$ such that $\|v_n-v\|\to0$.
The lower semicontinuity of $E$ grants that $E(v)\le\liminf_n E(v_n)<\infty$ and also that it holds

\[
\limsup_{n\to\infty}\|v_n-v\|_E \le \limsup_{n,m\to\infty}\|v_n-v_m\|_E = 0,
\]

which is the thesis. □
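
That $\|\cdot\|_E$ is a norm uses only convexity and 1-homogeneity of $E$, understood here as $E(\lambda v)=|\lambda|E(v)$ (this is our reading of the hypothesis). A sketch of the triangle inequality under that assumption:

```latex
% Convexity + 1-homogeneity give subadditivity:
%   E(u+v) = 2\,E\big(\tfrac12 u + \tfrac12 v\big) \le E(u) + E(v).
% The Euclidean triangle inequality in R^2, applied to the vectors
% (\|u\|, E(u)) and (\|v\|, E(v)), then yields
\[
\|u+v\|_E
 = \sqrt{\|u+v\|^2 + E^2(u+v)}
 \le \sqrt{(\|u\|+\|v\|)^2 + (E(u)+E(v))^2}
 \le \|u\|_E + \|v\|_E .
\]
```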

Therefore, if we want to build the space $W^{1,2}(X,\mathsf d,\mathfrak m)\subset L^2(X,\mathfrak m)$, the only thing that we
need is an $L^2$-lower semicontinuous functional playing the role which on $\mathbb R^d$ is played by the
$L^2$-norm of the distributional gradient of Sobolev functions. We certainly have this functional,
namely the map $f\mapsto\||Df|_*\|_{L^2(X,\mathfrak m)}=\||Df|_w\|_{L^2(X,\mathfrak m)}$. Hence the lemma above provides the
Banach space $W^{1,2}(X,\mathsf d,\mathfrak m)$. Notice that in general $W^{1,2}(X,\mathsf d,\mathfrak m)$ is not Hilbert: this is not
surprising, as already the Sobolev space $W^{1,2}$ built over $(\mathbb R^d,\|\cdot\|,\mathscr L^d)$ is not Hilbert if the
underlying norm $\|\cdot\|$ does not come from a scalar product.

4.4 Comparison with previous approaches 

It is now time to underline that the one proposed here is certainly not the first definition of 
Sobolev space over a metric measure space (we refer to [17] for a much broader overview on 
the subject). Here we confine the discussion only to weak notions of (modulus of) gradient, 
and in particular to [9] and [20, 30]. Also, we discuss only the quadratic case, referring to 
[5] for general power functions p and the independence (in a suitable sense) of p of minimal 
gradients. 

In [9] Cheeger proposed a relaxation procedure similar to the one used in Subsection 4.1,
but rather than relaxing the local Lipschitz constant of Lipschitz functions, he relaxed upper
gradients of arbitrary functions. More precisely, he defined

\[
E(f) := \inf\liminf_{n\to\infty}\|G_n\|_{L^2(X,\mathfrak m)},
\]
where the infimum is taken among all sequences $(f_n)$ converging to $f$ in $L^2(X,\mathfrak m)$ such that
$G_n$ is an upper gradient for $f_n$. Then, with the same computations done in Subsection 4.1
(actually and obviously, the story goes the other way around: we closely followed his arguments)
he showed that for $f\in D(E)$ there is an underlying notion of weak gradient $|Df|_C$,
called minimal generalized upper gradient, such that $E(f)=\||Df|_C\|_{L^2(X,\mathfrak m)}$ and

\[
|Df|_C \le G \qquad \mathfrak m\text{-a.e. in } X,
\]

for any $G$ weak limit of a sequence $(G_n)$ as in the definition of $E(f)$.

Notice that since the local Lipschitz constant is always an upper gradient for Lipschitz
functions, one certainly has

\[
|Df|_C \le |Df|_* \qquad \mathfrak m\text{-a.e. in } X, \text{ for any } f\in D(\mathrm{Ch}). \tag{4.19}
\]

Koskela and MacManus [20] introduced and Shanmugalingam [30] further studied a procedure
close to ours (again: actually we have been inspired by them) to produce a notion of
"norm of weak gradient" which does not require a relaxation procedure. Recall that for
$\Gamma\subset AC([0,1],X)$ the 2-Modulus $\mathrm{Mod}_2(\Gamma)$ is defined by

\[
\mathrm{Mod}_2(\Gamma) := \inf\Big\{\|\rho\|^2_{L^2(X,\mathfrak m)} : \int_\gamma\rho\ge1 \quad \forall\gamma\in\Gamma\Big\}. \tag{4.20}
\]

It is possible to show that the 2-Modulus is an outer measure on $AC([0,1],X)$. Building on this
notion, Koskela and MacManus [20] considered the class of functions $f$ which satisfy the upper
gradient inequality not necessarily along all curves, but only out of a $\mathrm{Mod}_2$-negligible set of
curves. In order to compare more properly this concept to Sobolev classes, Shanmugalingam
said that $G:X\to[0,\infty]$ is a weak upper gradient for $f$ if there exists $\tilde f=f$ $\mathfrak m$-a.e. such that

\[
|\tilde f(\gamma_0)-\tilde f(\gamma_1)| \le \int_\gamma G \qquad\text{for every } \gamma\in AC([0,1],X)\setminus N \text{ with } \mathrm{Mod}_2(N)=0.
\]

Then, she defined the energy $\tilde E:L^2(X,\mathfrak m)\to[0,\infty]$ by putting

\[
\tilde E(f) := \inf\|G\|_{L^2(X,\mathfrak m)},
\]

where the infimum is taken among all weak upper gradients $G$ of $f$ according to the previous
condition. Thanks to the properties of the 2-modulus (a stability property of weak upper
gradients analogous to ours), it is possible to show that $\tilde E$ is indeed $L^2$-lower semicontinuous,
so that it leads to a good definition of the Sobolev space. Also, using a key lemma due to
Fuglede, Shanmugalingam proved that $\tilde E=E$ on $L^2(X,\mathfrak m)$, so that they produce the same
definition of Sobolev space $W^{1,2}(X,\mathsf d,\mathfrak m)$ and the underlying gradient $|Df|_S$ which gives a
pointwise representation to $\tilde E(f)$ is the same $|Df|_C$ behind the energy $E$.

Observe now that for a Borel set $\Gamma\subset AC^2([0,1],X)$ and a test plan $\pi$, integrating w.r.t.
$\pi$ the inequality $\int_\gamma\rho\ge1$ $\forall\gamma\in\Gamma$ and then minimizing over $\rho$, we get

\[
[\pi(\Gamma)]^2 \le C(\pi)\,\mathrm{Mod}_2(\Gamma)\iint_0^1|\dot\gamma_s|^2\,ds\,d\pi(\gamma),
\]

which shows that any $\mathrm{Mod}_2$-negligible set of curves is also negligible according to Definition 4.10.
This fact easily yields that any $f\in D(\tilde E)$ is Sobolev along a.e. curve and satisfies

\[
|Df|_w \le |Df|_C \qquad \mathfrak m\text{-a.e. in } X. \tag{4.21}
\]
Given that we proved in Theorem 4.24 that $|Df|_*=|Df|_w$, inequalities (4.19) and (4.21)
also give that $|Df|_*=|Df|_w=|Df|_C=|Df|_S$ (the smallest one among the four notions
coincides with the largest one).
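
The inequality between $\pi(\Gamma)$ and $\mathrm{Mod}_2(\Gamma)$ used above follows from the estimate (4.13). A sketch, for any Borel $\rho\ge0$ admissible in (4.20):

```latex
% Since \int_\gamma \rho \ge 1 for every \gamma \in \Gamma, (4.13) with \varphi = \rho gives
\[
\pi(\Gamma) \le \int \Big(\int_\gamma \rho\Big)\,d\pi
 \le \Big(C(\pi)\int_X \rho^2\,d\mathfrak m\Big)^{1/2}
     \Big(\iint_0^1 |\dot\gamma_s|^2\,ds\,d\pi(\gamma)\Big)^{1/2}.
\]
% Squaring and taking the infimum over admissible \rho:
\[
[\pi(\Gamma)]^2 \le C(\pi)\,\mathrm{Mod}_2(\Gamma)\iint_0^1 |\dot\gamma_s|^2\,ds\,d\pi(\gamma).
\]
```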

What we get by the new approach to Sobolev spaces on metric measure spaces is the 
following result. 

Theorem 4.26 (Density in energy of Lipschitz functions) Let $(X,\mathsf d,\mathfrak m)$ be a compact
normalized metric measure space. Then for any $f\in L^2(X,\mathfrak m)$ with weak upper gradient in
$L^2(X,\mathfrak m)$ there exists a sequence $(f_n)$ of Lipschitz functions converging to $f$ in $L^2(X,\mathfrak m)$ such
that both $|Df_n|$ and $|Df_n|_w$ converge to $|Df|_w$ in $L^2(X,\mathfrak m)$ as $n\to\infty$.

Proof Straightforward consequence of the identity of weak and relaxed gradients and of
Proposition 4.3. □

Let us point out a few aspects behind the strategy of the proof of Theorem 4.26, which of
course strongly relies on Lemma 4.23 and Proposition 4.22. First of all, let us notice that the
stated existence of a sequence of Lipschitz functions $f_n$ converging to $f$ with $|Df_n|\to|Df|_w$
in $L^2(X,\mathfrak m)$ is equivalent to showing that

\[
\limsup_{n\to\infty} Y_{1/n}(f) \le \frac12\int_X |Df|_w^2\,d\mathfrak m, \tag{4.22}
\]
where, for $\tau>0$, $Y_\tau$ denotes the Yosida regularization

\[
Y_\tau(f) := \inf_{h\in\mathrm{Lip}(X)}\Big\{\frac12\int_X |Dh|^2\,d\mathfrak m + \frac{1}{2\tau}\int_X |h-f|^2\,d\mathfrak m\Big\}.
\]

In fact, the sequence $f_n$ can be chosen by a simple diagonal argument among the approximate
minimizers of $Y_{1/n}(f)$. On the other hand, it is well known that the relaxation procedure we
used to define the Cheeger energy yields

\[
Y_\tau(f) = \min_{h\in D(\mathrm{Ch})}\Big\{\mathrm{Ch}(h) + \frac{1}{2\tau}\int_X |h-f|^2\,d\mathfrak m\Big\}, \tag{4.23}
\]

and therefore (4.22) could be achieved by trying to estimate the Cheeger energy of the unique
minimizer $f_\tau$ of (4.23) in terms of $|Df|_w$.

Instead of using the Yosida regularization $Y_{1/n}$, in the proof of Theorem 4.24 we obtained
a better approximation of $f$ by flowing it (for a small time step, say $t_n\downarrow0$) through the $L^2$-gradient
flow $f_t$ of the Cheeger energy. This flow is strictly related to $Y_\tau$, since it can be
obtained as the limit of suitably rescaled iterated minimizers of $Y_\tau$ (the so-called Minimizing
Movement scheme, see e.g. [3]), but has the great advantage of providing a continuous curve of
probability densities $f_t$, which can be represented as the image of a test plan, through Lisini's
Theorem. Thanks to this representation and Kuwada's Lemma, we were allowed to use the
weak upper gradient $|Df|_w$ instead of $|Df|_*$ to estimate the entropy dissipation along $f_t$ (see
(4.18)) and to obtain the desired sharp bound on $|Df_s|_*$ at least for some time $s\in(0,t_n)$. In
any case, a posteriori we recovered the validity of (4.22).

This density result was previously known (via the use of maximal functions and covering
arguments) under the assumption that the space was doubling and supported a local Poincaré
inequality for weak upper gradients, see [9, Theorem 4.14, Theorem 4.24]. Actually, Cheeger
proved more, namely that under these hypotheses Lipschitz functions are dense in the $W^{1,2}$
norm, a result which is still unknown in the general case. Also, notice that another byproduct
of our density in energy result is the equivalence of local Poincaré inequality stated for
Lipschitz functions on the left hand side and slope on the right hand side, and local Poincaré
inequality stated for general functions on the left hand side and upper gradients on the right
hand side; this result was previously known [19] under much more restrictive assumptions on
the metric measure structure.

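5 The relative entropy and its $W_2$-gradient flow

In this section we study the $W_2$-gradient flow of the relative entropy on spaces with Ricci
curvature bounded below (in short: $CD(K,\infty)$ spaces). The content is essentially extracted
from [12]. As before the space $(X,\mathsf d,\mathfrak m)$ is compact and normalized (i.e. $\mathfrak m(X)=1$).
Recall that the relative entropy functional $\mathrm{Ent}_{\mathfrak m}:\mathscr P(X)\to[0,\infty]$ is defined by

\[
\mathrm{Ent}_{\mathfrak m}(\mu) :=
\begin{cases}
\int_X f\log f\,d\mathfrak m & \text{if } \mu=f\mathfrak m,\\
+\infty & \text{otherwise.}
\end{cases}
\]
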
Definition 5.1 (Weak bound from below on the Ricci curvature) We say that
$(X,d,m)$ has Ricci curvature bounded from below by $K$ for some $K \in \mathbb{R}$ if the Relative
Entropy functional $\mathrm{Ent}_m$ is $K$-convex along geodesics in $(\mathscr{P}(X), W_2)$. More precisely, if for
any $\mu_0, \mu_1 \in D(\mathrm{Ent}_m)$ there exists a constant speed geodesic $\mu_t : [0,1] \to \mathscr{P}(X)$ between $\mu_0$
and $\mu_1$ satisfying

\[
\mathrm{Ent}_m(\mu_t) \le (1-t)\,\mathrm{Ent}_m(\mu_0) + t\,\mathrm{Ent}_m(\mu_1) - \frac{K}{2}\, t(1-t)\, W_2^2(\mu_0,\mu_1) \qquad \forall t \in [0,1].
\]

This definition was introduced in [23] and [31]. Its two basic features are: compatibility
with the Riemannian case (i.e. a compact Riemannian manifold endowed with the normalized
volume measure has Ricci curvature bounded below by $K$ in the classical pointwise sense if
and only if $\mathrm{Ent}_m$ is $K$-geodesically convex in $(\mathscr{P}(X), W_2)$) and stability w.r.t. measured
Gromov-Hausdorff convergence.

We also recall that Finsler geometries are included in the class of metric measure spaces
with Ricci curvature bounded below. This means that if we have a smooth compact Finsler
manifold (that is: a differentiable manifold endowed with a norm, possibly not coming from
an inner product, on each tangent space which varies smoothly on the base point) endowed
with an arbitrary positive $C^\infty$ measure, then this space has Ricci curvature bounded below
by some $K \in \mathbb{R}$ (see the theorem stated at page 926 of [32] for the flat case and [24] for the
general one).

The goal now is to study the $W_2$-gradient flow of $\mathrm{Ent}_m$. Notice that the general theory
of gradient flows of $K$-convex functionals ensures the following existence result (see the
representation formula for the slope (2.7) and Theorem 2.2).

Theorem 5.2 (Consequences of the general theory of gradient flows) Let $(X,d,m)$
be a $CD(K,\infty)$ space. Then the slope $|D^-\mathrm{Ent}_m|$ is lower semicontinuous w.r.t. weak convergence
and for any $\mu \in D(\mathrm{Ent}_m)$ there exists a gradient flow (in the EDE sense of Definition
2.1) of $\mathrm{Ent}_m$ starting from $\mu$.

Thus, existence is granted. The problem is then to show uniqueness of the gradient flow. To
this aim, we need to introduce the concept of push forward via a plan.

Definition 5.3 (Push forward via a plan) Let $\mu \in \mathscr{P}(X)$ and let $\gamma \in \mathscr{P}(X^2)$ be such
that $\mu \ll \pi^1_\sharp\gamma$. The measures $\gamma_\mu \in \mathscr{P}(X^2)$ and $\gamma_\sharp\mu \in \mathscr{P}(X)$ are defined as:

\[
d\gamma_\mu(x,y) := \frac{d\mu}{d\pi^1_\sharp\gamma}(x)\, d\gamma(x,y), \qquad \gamma_\sharp\mu := \pi^2_\sharp\gamma_\mu.
\]

Observe that, since $\gamma_\mu \ll \gamma$, we have $\gamma_\sharp\mu \ll \pi^2_\sharp\gamma$. We will say that $\gamma$ has bounded deformation
if there exist $0 < c < C < \infty$ such that $c\,m \le \pi^i_\sharp\gamma \le C\,m$, $i = 1,2$. Writing $\mu = f\,\pi^1_\sharp\gamma$, the
definition gives that

\[
\gamma_\sharp\mu = \eta\,\pi^2_\sharp\gamma \quad \text{with } \eta \text{ given by} \quad \eta(y) = \int f(x)\, d\gamma_y(x), \tag{5.1}
\]

where $\{\gamma_y\}_{y \in X}$ is the disintegration of $\gamma$ w.r.t. its second marginal.

The operation of push forward via a plan has interesting properties in connection with 
the relative entropy functional. 

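Proposition 5.4 The following properties hold:

(i) For any $\mu, \nu \in \mathscr{P}(X)$, $\gamma \in \mathscr{P}(X^2)$ such that $\mu, \nu \ll \pi^1_\sharp\gamma$ it holds

\[
\mathrm{Ent}_{\gamma_\sharp\nu}(\gamma_\sharp\mu) \le \mathrm{Ent}_\nu(\mu).
\]

(ii) For $\mu \in D(\mathrm{Ent}_m)$ and $\gamma \in \mathscr{P}(X^2)$ with bounded deformation, it holds $\gamma_\sharp\mu \in D(\mathrm{Ent}_m)$.

(iii) Given $\gamma \in \mathscr{P}(X^2)$ with bounded deformation, the map

\[
D(\mathrm{Ent}_m) \ni \mu \mapsto \mathrm{Ent}_m(\mu) - \mathrm{Ent}_m(\gamma_\sharp\mu)
\]

is convex (w.r.t. linear interpolation of measures).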

Proof 

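(i). We can assume $\mu \ll \nu$, otherwise there is nothing to prove. Then it is immediate to check
from the definition that $\gamma_\sharp\mu \ll \gamma_\sharp\nu$. Let $\mu = f\nu$, $\nu = \theta\,\pi^1_\sharp\gamma$, $\gamma_\sharp\mu = \eta\,\gamma_\sharp\nu$, and $u(z) := z\log z$.
By disintegrating $\gamma$ as in (5.1), we have that

\[
\eta(y) = \int f(x)\, d\tilde\gamma_y(x), \qquad \tilde\gamma_y = \Bigl(\int \theta(x)\, d\gamma_y(x)\Bigr)^{-1} \theta\,\gamma_y.
\]

The convexity of $u$ and Jensen's inequality with the probability measures $\tilde\gamma_y$ yield

\[
u(\eta(y)) \le \int u(f(x))\, d\tilde\gamma_y(x).
\]

Since $\{\tilde\gamma_y\}_{y \in X}$ is the disintegration of $\tilde\gamma = (\theta \circ \pi^1)\gamma$ with respect to its second marginal $\gamma_\sharp\nu$
and the first marginal of $\tilde\gamma$ is $\nu$, by integration of both sides with respect to $\gamma_\sharp\nu$ we get

\[
\begin{aligned}
\mathrm{Ent}_{\gamma_\sharp\nu}(\gamma_\sharp\mu) &= \int u(\eta(y))\, d\gamma_\sharp\nu(y) \le \int \Bigl(\int u(f(x))\, d\tilde\gamma_y(x)\Bigr) d\gamma_\sharp\nu(y) \\
&= \int u(f(x))\, d\tilde\gamma(x,y) = \int u(f(x))\, d\nu(x) = \mathrm{Ent}_\nu(\mu).
\end{aligned}
\]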
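(ii). Taking into account the identity

\[
\mathrm{Ent}_\nu(\mu) = \mathrm{Ent}_\sigma(\mu) + \int \log\Bigl(\frac{d\sigma}{d\nu}\Bigr)\, d\mu, \tag{5.2}
\]

valid for any $\mu, \nu, \sigma \in \mathscr{P}(X)$ with $\sigma$ having bounded density w.r.t. $\nu$, the fact that $\gamma_\sharp(\pi^1_\sharp\gamma) = \pi^2_\sharp\gamma$
and the fact that $c\,m \le \pi^1_\sharp\gamma, \pi^2_\sharp\gamma \le C\,m$, the conclusion follows from

\[
\mathrm{Ent}_m(\gamma_\sharp\mu) \le \mathrm{Ent}_{\pi^2_\sharp\gamma}(\gamma_\sharp\mu) + \log C \le \mathrm{Ent}_{\pi^1_\sharp\gamma}(\mu) + \log C \le \mathrm{Ent}_m(\mu) + \log C - \log c.
\]

(iii). Let $\mu_0, \mu_1 \in D(\mathrm{Ent}_m)$ and define $\mu_t := (1-t)\mu_0 + t\mu_1$ and $\nu_t := \gamma_\sharp\mu_t$. A direct
computation shows that

\[
\begin{aligned}
(1-t)\,\mathrm{Ent}_m(\mu_0) + t\,\mathrm{Ent}_m(\mu_1) - \mathrm{Ent}_m(\mu_t) &= (1-t)\,\mathrm{Ent}_{\mu_t}(\mu_0) + t\,\mathrm{Ent}_{\mu_t}(\mu_1), \\
(1-t)\,\mathrm{Ent}_m(\nu_0) + t\,\mathrm{Ent}_m(\nu_1) - \mathrm{Ent}_m(\nu_t) &= (1-t)\,\mathrm{Ent}_{\nu_t}(\nu_0) + t\,\mathrm{Ent}_{\nu_t}(\nu_1),
\end{aligned}
\]

and from (i) we have that

\[
\mathrm{Ent}_{\mu_t}(\mu_i) \ge \mathrm{Ent}_{\gamma_\sharp\mu_t}(\gamma_\sharp\mu_i) = \mathrm{Ent}_{\nu_t}(\nu_i), \qquad \forall t \in [0,1],\ i = 0,1,
\]

which gives the conclusion. □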
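Property (i) of Proposition 5.4 can be sanity-checked numerically on a finite space. The following is only a sketch under stated assumptions: X is replaced by a set of n points, measures and plans are plain lists of weights, and the function names (`rel_entropy`, `push_forward_via_plan`, `worst_violation`) are ours and do not appear in the text. On a finite space the push forward via a plan acts as a Markov kernel, so (i) reduces to the familiar monotonicity of relative entropy.

```python
import math
import random


def rel_entropy(mu, nu):
    """Ent_nu(mu) = sum_i mu_i * log(mu_i / nu_i); +inf if mu is not << nu."""
    total = 0.0
    for m_i, n_i in zip(mu, nu):
        if m_i > 0.0:
            if n_i == 0.0:
                return math.inf
            total += m_i * math.log(m_i / n_i)
    return total


def push_forward_via_plan(gamma, mu):
    """Discrete analogue of gamma_sharp(mu) from Definition 5.3.

    Row x of the plan is reweighted by the density of mu w.r.t. the first
    marginal of gamma; the second marginal of the result is returned.
    """
    first_marginal = [sum(row) for row in gamma]
    out = [0.0] * len(gamma[0])
    for x, row in enumerate(gamma):
        if first_marginal[x] > 0.0:
            weight = mu[x] / first_marginal[x]
            for y, g in enumerate(row):
                out[y] += weight * g
    return out


def random_probability(n, rng):
    """Random strictly positive probability vector of length n."""
    w = [rng.random() + 0.05 for _ in range(n)]
    s = sum(w)
    return [v / s for v in w]


def worst_violation(trials=200, n=4, seed=0):
    """Largest observed value of Ent_{gamma#nu}(gamma#mu) - Ent_nu(mu)."""
    rng = random.Random(seed)
    worst = -math.inf
    for _ in range(trials):
        flat = random_probability(n * n, rng)
        gamma = [flat[i * n:(i + 1) * n] for i in range(n)]
        mu = random_probability(n, rng)
        nu = random_probability(n, rng)
        lhs = rel_entropy(push_forward_via_plan(gamma, mu),
                          push_forward_via_plan(gamma, nu))
        worst = max(worst, lhs - rel_entropy(mu, nu))
    return worst
```

If (i) holds, `worst_violation()` should never be positive beyond rounding error.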
In the next lemma and in the sequel we use the short notation

\[
C(\gamma) := \int_{X \times X} d^2(x,y)\, d\gamma(x,y).
\]

Lemma 5.5 (Approximability in Entropy and distance) Let $\mu, \nu \in D(\mathrm{Ent}_m)$. Then
there exists a sequence $(\gamma^n)$ of plans with bounded deformation such that $\mathrm{Ent}_m(\gamma^n_\sharp\mu) \to \mathrm{Ent}_m(\nu)$
and $C(\gamma^n_\mu) \to W_2^2(\mu,\nu)$ as $n \to \infty$.

Proof Let $f$ and $g$ respectively be the densities of $\mu$ and $\nu$ w.r.t. $m$; pick $\gamma \in \mathrm{Opt}(\mu,\nu)$ and,
for every $n \in \mathbb{N}$, let $A_n := \{(x,y) : f(x) + g(y) \le n\}$ and

\[
\gamma^n := c_n \Bigl( \gamma|_{A_n} + \frac{1}{n}\,(\mathrm{Id},\mathrm{Id})_\sharp m \Bigr),
\]

where $c_n \to 1$ is the normalization constant. It is immediate to check that $\gamma^n$ is of bounded
deformation and that this sequence satisfies the thesis (see [12] for further details). □

Proposition 5.6 (Convexity of the squared slope) Let $(X,d,m)$ be a $CD(K,\infty)$ space.
Then the map

\[
D(\mathrm{Ent}_m) \ni \mu \mapsto |D^-\mathrm{Ent}_m|^2(\mu)
\]

is convex (w.r.t. linear interpolation of measures).

Notice that the only assumption that we make is the $K$-convexity of the entropy w.r.t.
$W_2$, and from this we deduce the convexity w.r.t. the classical linear interpolation of measures
of the squared slope.
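Proof Recall that from (2.7) we know that

\[
|D^-\mathrm{Ent}_m|(\mu) = \sup_{\nu \in \mathscr{P}(X),\ \nu \ne \mu} \frac{\bigl[\mathrm{Ent}_m(\mu) - \mathrm{Ent}_m(\nu) - \frac{K^-}{2} W_2^2(\mu,\nu)\bigr]^+}{W_2(\mu,\nu)}.
\]

We claim that it also holds

\[
|D^-\mathrm{Ent}_m|(\mu) = \sup_{\gamma} \frac{\bigl[\mathrm{Ent}_m(\mu) - \mathrm{Ent}_m(\gamma_\sharp\mu) - \frac{K^-}{2} C(\gamma_\mu)\bigr]^+}{\sqrt{C(\gamma_\mu)}},
\]

where the supremum is taken among all plans with bounded deformation (where the right
hand side is taken 0 by definition if $C(\gamma_\mu) = 0$).

Indeed, Lemma 5.5 gives that the first expression is not larger than the second. For the
converse inequality we can assume $C(\gamma_\mu) > 0$, $\nu = \gamma_\sharp\mu \ne \mu$, and $K < 0$. Then it is sufficient
to apply the simple inequality

\[
a, b, c \in \mathbb{R},\quad 0 < b \le c \qquad \Longrightarrow \qquad \frac{(a-b)^+}{\sqrt{b}} \ge \frac{(a-c)^+}{\sqrt{c}},
\]

with $a := \mathrm{Ent}_m(\mu) - \mathrm{Ent}_m(\gamma_\sharp\mu)$, $b := \frac{K^-}{2} W_2^2(\mu, \gamma_\sharp\mu)$ and $c := \frac{K^-}{2} C(\gamma_\mu)$.

Thus, to prove the thesis it is enough to show that for every $\gamma$ with bounded deformation
the map

\[
D(\mathrm{Ent}_m) \ni \mu \mapsto \frac{\Bigl[\bigl(\mathrm{Ent}_m(\mu) - \mathrm{Ent}_m(\gamma_\sharp\mu) - \frac{K^-}{2} C(\gamma_\mu)\bigr)^+\Bigr]^2}{C(\gamma_\mu)}
\]

is convex w.r.t. linear interpolation of measures.
Clearly the map

\[
D(\mathrm{Ent}_m) \ni \mu \mapsto C(\gamma_\mu) = \int \Bigl( \int d^2(x,y)\, d\gamma_x(y) \Bigr) d\mu(x),
\]

where $\{\gamma_x\}$ is the disintegration of $\gamma$ w.r.t. its first marginal, is linear. Thus, from (iii) of
Proposition 5.4 we know that the map

\[
\mu \mapsto \mathrm{Ent}_m(\mu) - \mathrm{Ent}_m(\gamma_\sharp\mu) - \frac{K^-}{2} C(\gamma_\mu)
\]

is convex w.r.t. linear interpolation of measures. Hence the same is true for its positive part.
The conclusion follows from the fact that the function $\Psi : [0,\infty)^2 \to \mathbb{R} \cup \{+\infty\}$ defined by

\[
\Psi(a,b) := \begin{cases} \dfrac{a^2}{b} & \text{if } b > 0, \\ +\infty & \text{if } b = 0,\ a > 0, \\ 0 & \text{if } a = b = 0, \end{cases}
\]

is convex and it is nondecreasing w.r.t. $a$. □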
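The last step of the proof rests on the convexity of the function $(a,b) \mapsto a^2/b$, which makes the composition of a nonnegative convex map with a linear one convex. As a hedged illustration (the names `psi` and `max_convexity_defect` are ours, and random sampling is of course no substitute for a proof), convexity can be tested on random pairs of points:

```python
import math
import random


def psi(a, b):
    """Psi(a, b) = a^2 / b for b > 0, +inf if b = 0 < a, and 0 if a = b = 0."""
    if b > 0.0:
        return a * a / b
    return math.inf if a > 0.0 else 0.0


def max_convexity_defect(trials=10000, seed=1):
    """Largest observed value of Psi(mix) - mix of Psi over random segments.

    For a convex function this quantity is never positive (up to rounding).
    """
    rng = random.Random(seed)
    worst = -math.inf
    for _ in range(trials):
        a0, b0 = rng.uniform(0.0, 5.0), rng.uniform(0.01, 5.0)
        a1, b1 = rng.uniform(0.0, 5.0), rng.uniform(0.01, 5.0)
        lam = rng.random()
        a = (1.0 - lam) * a0 + lam * a1
        b = (1.0 - lam) * b0 + lam * b1
        defect = psi(a, b) - ((1.0 - lam) * psi(a0, b0) + lam * psi(a1, b1))
        worst = max(worst, defect)
    return worst
```

Monotonicity in $a$ is what allows one to replace the convex numerator by an upper bound along the interpolation before applying convexity of the quotient.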
The convexity of the squared slope allows to prove uniqueness of the gradient flow of the
entropy:

Theorem 5.7 (Uniqueness of the gradient flow of $\mathrm{Ent}_m$) Let $(X,d,m)$ be a $CD(K,\infty)$
space and let $\mu \in D(\mathrm{Ent}_m)$. Then there exists a unique gradient flow of $\mathrm{Ent}_m$ starting from $\mu$
in $(\mathscr{P}(X), W_2)$.

Proof We recall (inequality (2.4)) that the squared Wasserstein distance is convex w.r.t.
linear interpolation of measures. Therefore, given two absolutely continuous curves $(\mu^1_t)$ and
$(\mu^2_t)$, the curve $t \mapsto \mu_t := \frac{\mu^1_t + \mu^2_t}{2}$ is absolutely continuous as well and its metric speed can be
bounded from above by

\[
|\dot\mu_t|^2 \le \frac{|\dot\mu^1_t|^2 + |\dot\mu^2_t|^2}{2}, \qquad \text{for a.e. } t \in (0,\infty). \tag{5.3}
\]

Let $(\mu^1_t)$ and $(\mu^2_t)$ be gradient flows of $\mathrm{Ent}_m$ starting from $\mu \in D(\mathrm{Ent}_m)$. Then we have

\[
\begin{aligned}
\mathrm{Ent}_m(\mu) &= \mathrm{Ent}_m(\mu^1_T) + \frac12 \int_0^T |\dot\mu^1_t|^2\, dt + \frac12 \int_0^T |D^-\mathrm{Ent}_m|^2(\mu^1_t)\, dt, \qquad \forall T > 0, \\
\mathrm{Ent}_m(\mu) &= \mathrm{Ent}_m(\mu^2_T) + \frac12 \int_0^T |\dot\mu^2_t|^2\, dt + \frac12 \int_0^T |D^-\mathrm{Ent}_m|^2(\mu^2_t)\, dt, \qquad \forall T > 0.
\end{aligned}
\]

Adding up these two equalities, using the convexity of the squared slope guaranteed by Proposition
5.6, the convexity of the squared metric speed given by (5.3) and the strict convexity
of the relative entropy, we deduce that for the curve $t \mapsto \mu_t$ it holds

\[
\mathrm{Ent}_m(\mu) > \mathrm{Ent}_m(\mu_T) + \frac12 \int_0^T |\dot\mu_t|^2\, dt + \frac12 \int_0^T |D^-\mathrm{Ent}_m|^2(\mu_t)\, dt,
\]

for every $T$ such that $\mu^1_T \ne \mu^2_T$. This contradicts inequality (2.9). □



6 The heat flow as gradient flow 

It is well known that on $\mathbb{R}^d$ the heat flow can be seen both as gradient flow of the Dirichlet
energy in $L^2$ and as gradient flow of the relative entropy in $(\mathscr{P}_2(\mathbb{R}^d), W_2)$. It is therefore
natural to ask whether this identification between the two a priori different gradient flows
persists or not in a general compact and normalized metric measure space $(X,d,m)$.

The strategy consists in considering a gradient flow $(f_t)$ of $\mathrm{Ch}$ with nonnegative initial data
and in proving that the curve $t \mapsto \mu_t := f_t m$ is a gradient flow of $\mathrm{Ent}_m(\cdot)$ in $(\mathscr{P}(X), W_2)$: by
the uniqueness result of Theorem 5.7 this will be sufficient to conclude.

We already built most of the ingredients needed for the proof to work, the only thing that 
we should add is the following lemma, where the slope of $\mathrm{Ent}_m$ is bounded from above in
terms of the notions of "norm of weak gradient" that we discussed in Chapter 4. Notice that
the bound (6.3) for Lipschitz functions was already known to Lott-Villani ([23]), so that our 
added value here is the use of the density in energy of Lipschitz functions to get the correct, 
sharp inequality (6.1) (sharpness will be seen in (6.4)). 
Lemma 6.1 (Fisher bounds slope) Let $(X,d,m)$ be a compact and normalized $CD(K,\infty)$
metric-measure space and let $f$ be a probability density which is Sobolev along a.e. curve. Then

\[
|D^-\mathrm{Ent}_m|^2(f m) \le \int_X \frac{|Df|_w^2}{f}\, dm = 4 \int_X |D\sqrt{f}|_w^2\, dm. \tag{6.1}
\]

Proof Assume at first that $f$ is Lipschitz with $0 < c \le f$, and let $(f_n)$ be a sequence of
probability densities such that $W_2(f_n m, f m) \to 0$ and where the slope of $\mathrm{Ent}_m$ at $f m$ is
attained. Choose $\gamma_n \in \mathrm{Opt}(f m, f_n m)$ and notice that

\[
\begin{aligned}
\int_X f \log f\, dm - \int_X f_n \log f_n\, dm &\le \int_X (f - f_n) \log f\, dm \\
&= \int \bigl( \log f(x) - \log f(y) \bigr)\, d\gamma_n(x,y) \\
&\le \Bigl( \int \frac{|\log f(x) - \log f(y)|^2}{d^2(x,y)}\, d\gamma_n(x,y) \Bigr)^{1/2} \Bigl( \int d^2(x,y)\, d\gamma_n(x,y) \Bigr)^{1/2} \\
&= \Bigl( \int \Bigl( \int L^2(x,y)\, d\gamma_{n,x}(y) \Bigr) f(x)\, dm(x) \Bigr)^{1/2} W_2(f m, f_n m),
\end{aligned} \tag{6.2}
\]

where $\gamma_{n,x}$ is the disintegration of $\gamma_n$ with respect to $f m$, and $L$ is the bounded Borel function

\[
L(x,y) := \begin{cases} \dfrac{|\log f(x) - \log f(y)|}{d(x,y)} & \text{if } x \ne y, \\[2mm] |D \log f|(x) = \dfrac{|Df|(x)}{f(x)} & \text{if } x = y. \end{cases}
\]

Notice that for every $x \in X$ the map $y \mapsto L(x,y)$ is upper-semicontinuous; since
$\int \bigl( \int d^2(x,y)\, d\gamma_{n,x}(y) \bigr) f(x)\, dm(x) \to 0$ as $n \to \infty$, we can assume without loss of generality that

\[
\lim_{n \to \infty} \int d^2(x,y)\, d\gamma_{n,x}(y) = 0 \qquad \text{for } f m\text{-a.e. } x \in X.
\]

Fatou's Lemma then yields

\[
\limsup_{n \to \infty} \int L^2(x,y)\, d\gamma_n(x,y) \le \int L^2(x,x) f(x)\, dm(x) = \int_X \frac{|Df|^2}{f}\, dm,
\]

hence (6.2) gives

\[
|D^-\mathrm{Ent}_m|(f m) = \lim_{n \to \infty} \frac{\bigl( \mathrm{Ent}_m(f m) - \mathrm{Ent}_m(f_n m) \bigr)^+}{W_2(f m, f_n m)} \le \sqrt{ \int_X \frac{|Df|^2}{f}\, dm }. \tag{6.3}
\]

We now turn to the general case. Let $f$ be any probability density Sobolev along a.e.
curve such that $\sqrt{f} \in D(\mathrm{Ch})$ (otherwise there is nothing to prove). We use Theorem 4.26 to
find a sequence of Lipschitz functions $(\sqrt{f_n})$ converging to $\sqrt{f}$ in $L^2(X,m)$ and such that
$|D\sqrt{f_n}| \to |D\sqrt{f}|_w$ in $L^2(X,m)$ and $m$-a.e.. Up to adding positive and vanishing constants
and multiplying by suitable normalization factors, we can assume that $0 < c_n \le f_n$
and $\int_X f_n\, dm = 1$, for any $n \in \mathbb{N}$. The conclusion follows passing to the limit in (6.3) by
taking into account the weak lower semicontinuity of $|D^-\mathrm{Ent}_m|$ (formula (2.7) and discussion
thereafter). □
Theorem 6.2 (The heat flow as gradient flow) Let $f_0 \in L^2(X,m)$ be such that $\mu_0 =
f_0 m \in \mathscr{P}(X)$ and denote by $(f_t)$ the gradient flow of $\mathrm{Ch}$ in $L^2(X,m)$ starting from $f_0$ and by
$(\mu_t)$ the gradient flow of $\mathrm{Ent}_m$ in $(\mathscr{P}(X), W_2)$ starting from $\mu_0$. Then $\mu_t = f_t m$ for any $t \ge 0$.

Proof Thanks to the uniqueness result of Theorem 5.7, it is sufficient to prove that $(f_t m)$
satisfies the Energy Dissipation Equality for $\mathrm{Ent}_m$ in $(\mathscr{P}(X), W_2)$. We assume first that
$0 < c \le f_0 \le C < \infty$ $m$-a.e. in $X$, so that the maximum principle (Proposition 4.9) ensures
$0 < c \le f_t \le C < \infty$ for any $t > 0$. By Proposition 4.9 we know that $t \mapsto \mathrm{Ent}_m(f_t m)$ is
absolutely continuous with derivative equal to $-\int_X \frac{|Df_t|_w^2}{f_t}\, dm$. Lemma 4.23 ensures that
$t \mapsto f_t m$ is absolutely continuous w.r.t. $W_2$ with squared metric speed bounded by $\int_X \frac{|Df_t|_w^2}{f_t}\, dm$,
so that taking into account Lemma 6.1 we get

\[
\mathrm{Ent}_m(f_0 m) \ge \mathrm{Ent}_m(f_t m) + \frac12 \int_0^t |\dot{f_s m}|^2\, ds + \frac12 \int_0^t |D^-\mathrm{Ent}_m|^2(f_s m)\, ds,
\]

which, together with (2.9), ensures the thesis.

For the general case we argue by approximation, considering $f^n_0 :=
c_n \min\{n, \max\{f_0, 1/n\}\}$, $c_n$ being the normalizing constant, and the corresponding
gradient flow $(f^n_t)$ of $\mathrm{Ch}$. The fact that $f^n_0 \to f_0$ in $L^2(X,m)$ and the convexity of $\mathrm{Ch}$ implies
that $f^n_t \to f_t$ in $L^2(X,m)$ for any $t > 0$. In particular, $W_2(f^n_t m, f_t m) \to 0$ as $n \to \infty$ for
every $t$ (because convergence w.r.t. $W_2$ is equivalent to weak convergence of measures).

Now notice that we know that

\[
\mathrm{Ent}_m(f^n_0 m) = \mathrm{Ent}_m(f^n_t m) + \frac12 \int_0^t |\dot{f^n_s m}|^2\, ds + \frac12 \int_0^t |D^-\mathrm{Ent}_m|^2(f^n_s m)\, ds, \qquad \forall t > 0.
\]

Furthermore, it is immediate to check that $\mathrm{Ent}_m(f^n_0 m) \to \mathrm{Ent}_m(f_0 m)$ as $n \to \infty$. The
pointwise convergence of $f^n_t m$ to $f_t m$ w.r.t. $W_2$ easily yields that the terms on the right hand
side of the last equation are lower semicontinuous when $n \to \infty$ (recall Theorem 5.2 for the
slope). Thus it holds

\[
\mathrm{Ent}_m(f_0 m) \ge \mathrm{Ent}_m(f_t m) + \frac12 \int_0^t |\dot{f_s m}|^2\, ds + \frac12 \int_0^t |D^-\mathrm{Ent}_m|^2(f_s m)\, ds, \qquad \forall t > 0,
\]

which, by (2.11), is the thesis.

We know, by Theorem 5.7, that there is at most one gradient flow of $\mathrm{Ent}_m$ starting from $\mu_0$. We also
know that a gradient flow $(f_t)$ of $\mathrm{Ch}$ starting from $f_0$ exists, and the first part of the proof gives
that $(f_t m)$ is a gradient flow of $\mathrm{Ent}_m$. The uniqueness of gradient flows gives $\mu_t = f_t m$ for all $t \ge 0$.

□

As a consequence of the previous Theorem 6.2 it would not be difficult to prove that the
inequality (6.1) is in fact an identity: if $(X,d,m)$ is a compact and normalized $CD(K,\infty)$
space, then $|D^-\mathrm{Ent}_m|(f m) < \infty$ if and only if the probability density $f$ is Sobolev along a.e.
curve and $\sqrt{f} \in D(\mathrm{Ch})$; in this case

\[
|D^-\mathrm{Ent}_m|^2(f m) = \int_X \frac{|Df|_w^2}{f}\, dm = 4 \int_X |D\sqrt{f}|_w^2\, dm. \tag{6.4}
\]
7 A metric Brenier theorem

In this section we state and prove the metric Brenier theorem in $CD(K,\infty)$ spaces we announced
in the introduction. It was recently proven in [14] that under an additional non-branching
assumption one can really recover an optimal transport map, see also [7] for related
results, obtained under stronger non-branching assumptions and weaker convexity assumptions.

Definition 7.1 (Strong $CD(K,\infty)$ spaces) We say that a compact normalized metric
measure space $(X,d,m)$ is a strong $CD(K,\infty)$ space if for any $\mu_0, \mu_1 \in D(\mathrm{Ent}_m)$ there
exists $\pi \in \mathrm{GeoOpt}(\mu_0,\mu_1)$ with the following property. For any bounded Borel function
$F : \mathrm{Geo}(X) \to [0,\infty)$ such that $\int F\, d\pi = 1$, it holds

\[
\mathrm{Ent}_m(\mu^F_t) \le (1-t)\,\mathrm{Ent}_m(\mu^F_0) + t\,\mathrm{Ent}_m(\mu^F_1) - \frac{K}{2}\, t(1-t)\, W_2^2(\mu^F_0, \mu^F_1),
\]

where $\mu^F_t := (e_t)_\sharp(F\pi)$, for any $t \in [0,1]$.

Thus, the difference between strong $CD(K,\infty)$ spaces and standard $CD(K,\infty)$ ones is the
fact that geodesic convexity is required along all geodesics induced by the weighted plans $F\pi$,
rather than the one induced by $\pi$ only. Notice that the necessary and sufficient optimality
conditions ensure that $(e_0,e_1)_\sharp\pi$ is concentrated on a $c$-monotone set, hence $(e_0,e_1)_\sharp(F\pi)$ has
the same property and it is optimal, relative to its marginals. (We remark that recent results
of Rajala [28] suggest that it is not necessary to assume this stronger convexity to get the
metric Brenier theorem - and hence not even a treatable notion of spaces with Riemannian
Ricci curvature bounded from below - see [2] for progress in this direction.)

It is not clear to us whether the notion of being strong $CD(K,\infty)$ is stable or not w.r.t.
measured Gromov-Hausdorff convergence and, as such, it should be handled with care. The
importance of strong $CD(K,\infty)$ bounds relies on the fact that on these spaces geodesic
interpolation between bounded probability densities is made of bounded densities as well,
thus granting the existence of many test plans.

Notice that non-branching $CD(K,\infty)$ spaces are always strong $CD(K,\infty)$ spaces, indeed
let $\mu_0, \mu_1 \in D(\mathrm{Ent}_m)$ and pick $\pi \in \mathrm{GeoOpt}(\mu_0,\mu_1)$ such that $\mathrm{Ent}_m$ is $K$-convex along $((e_t)_\sharp\pi)$.
From the non-branching hypothesis it follows that for $F$ as in Definition 7.1 there exists a
unique element in $\mathrm{GeoOpt}(\mu^F_t, \mu^F_1)$ (resp. in $\mathrm{GeoOpt}(\mu^F_t, \mu^F_0)$). Also, since $F$ is bounded,
from $\mu_t \in D(\mathrm{Ent}_m)$ we deduce $\mu^F_t \in D(\mathrm{Ent}_m)$. Hence the map $t \mapsto \mathrm{Ent}_m(\mu^F_t)$ is $K$-convex
and bounded on $[\varepsilon, 1]$ and on $[0, 1-\varepsilon]$ for all $\varepsilon \in (0,1)$, and therefore it is $K$-convex on $[0,1]$.

Proposition 7.2 (Bound on geodesic interpolant) Let $(X,d,m)$ be a strong $CD(K,\infty)$
space and let $\mu_0, \mu_1 \in \mathscr{P}(X)$ be with bounded densities. Then there exists a test plan $\pi \in
\mathrm{GeoOpt}(\mu_0,\mu_1)$ so that the induced geodesic $\mu_t = (e_t)_\sharp\pi$ connecting $\mu_0$ to $\mu_1$ is made of
measures with uniformly bounded densities.

Proof Let $M$ be an upper bound on the densities of $\mu_0, \mu_1$, let $\pi \in \mathrm{GeoOpt}(\mu_0,\mu_1)$ be a plan
which satisfies the assumptions of Definition 7.1 and $\mu_t := (e_t)_\sharp\pi$. We claim that the measures
$\mu_t$ have uniformly bounded densities. The fact that $\mu_t \ll m$ is obvious by geodesic convexity,
so let $f_t$ be the density of $\mu_t$ and assume by contradiction that for some $t_0 \in [0,1]$ it holds

\[
f_{t_0}(x) > M e^{K^- D^2/8} \qquad \forall x \in A, \tag{7.1}
\]

where $m(A) > 0$ and $D$ is the diameter of $X$. Define $\tilde\pi := c\,\pi|_{e_{t_0}^{-1}(A)}$, where $c$ is the normalizing
constant (notice that $\tilde\pi$ is well defined, because $\pi(e_{t_0}^{-1}(A)) = \mu_{t_0}(A) > 0$) and observe that
the density of $\tilde\pi$ w.r.t. $\pi$ is bounded. Let $\tilde\mu_t := (e_t)_\sharp\tilde\pi$ and $\tilde f_t$ its density w.r.t. $m$. From (7.1)
we get $\tilde f_{t_0} = c f_{t_0}$ on $A$ and $\tilde f_{t_0} = 0$ on $X \setminus A$, hence

\[
\mathrm{Ent}_m(\tilde\mu_{t_0}) = \int \log(\tilde f_{t_0} \circ e_{t_0})\, d\tilde\pi > \log c + \log M + \frac{K^-}{8} D^2. \tag{7.2}
\]

On the other hand, we have $\tilde f_0 \le c f_0 \le cM$ and $\tilde f_1 \le c f_1 \le cM$ and thus

\[
\mathrm{Ent}_m(\tilde\mu_i) = \int \log(\tilde f_i \circ e_i)\, d\tilde\pi \le \log c + \log M, \qquad i = 0,1. \tag{7.3}
\]

Finally, it certainly holds $W_2^2(\tilde\mu_0, \tilde\mu_1) \le D^2$, so that (7.2) and (7.3) contradict the $K$-convexity
of $\mathrm{Ent}_m$ along $(\tilde\mu_t)$. Hence (7.1) is false and the $f_t$'s are uniformly bounded. □

An important consequence of this uniform bound is the following metric version of Brenier's 
theorem. 

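Theorem 7.3 (A metric Brenier theorem) Let $(X,d,m)$ be a strong $CD(K,\infty)$ space,
let $f_0, f_1$ be probability densities and $\varphi$ any Kantorovich potential for the couple $(f_0 m, f_1 m)$.
Then for every $\pi \in \mathrm{GeoOpt}(f_0 m, f_1 m)$ it holds

\[
d(\gamma_0,\gamma_1) = |D\varphi|_w(\gamma_0) = |D^+\varphi|(\gamma_0), \qquad \text{for } \pi\text{-a.e. } \gamma. \tag{7.4}
\]

In particular,

\[
W_2^2(f_0 m, f_1 m) = \int_X |D\varphi|_w^2\, f_0\, dm.
\]

If moreover $f_0, f_1 \in L^\infty(X,m)$ and $\pi$ is a test plan (such a plan exists thanks to Proposition
7.2) then

\[
\lim_{t \downarrow 0} \frac{\varphi(\gamma_0) - \varphi(\gamma_t)}{d(\gamma_0,\gamma_t)} = |D^+\varphi|(\gamma_0) \qquad \text{in } L^2(\mathrm{Geo}(X), \pi). \tag{7.5}
\]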

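Proof $\varphi$ is Lipschitz, therefore $|D^+\varphi|$ is an upper gradient of $\varphi$, and hence $|D\varphi|_w \le |D^+\varphi|$
$m$-a.e.. Now fix $x \in X$ and pick any $y \in \partial^c\varphi(x)$. From the $c$-concavity of $\varphi$ we get

\[
\begin{aligned}
\varphi(x) &= \frac{d^2(x,y)}{2} - \varphi^c(y), \\
\varphi(z) &\le \frac{d^2(z,y)}{2} - \varphi^c(y) \qquad \forall z \in X.
\end{aligned}
\]

Therefore

\[
\varphi(z) - \varphi(x) \le \frac{d^2(z,y)}{2} - \frac{d^2(x,y)}{2} \le d(z,x)\, \frac{d(z,y) + d(x,y)}{2}.
\]

Dividing by $d(x,z)$ and letting $z \to x$, by the arbitrariness of $y \in \partial^c\varphi(x)$ and the fact that
$\mathrm{supp}((e_0,e_1)_\sharp\pi) \subset \partial^c\varphi$ we get

\[
|D^+\varphi|(\gamma_0) \le \min_{y \in \partial^c\varphi(\gamma_0)} d(\gamma_0, y) \le d(\gamma_0,\gamma_1) \qquad \text{for } \pi\text{-a.e. } \gamma.
\]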
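Since

\[
\int_X |D\varphi|_w^2\, f_0\, dm \le \int |D^+\varphi|^2(\gamma_0)\, d\pi(\gamma) \qquad \text{and} \qquad \int d^2(\gamma_0,\gamma_1)\, d\pi(\gamma) = W_2^2(f_0 m, f_1 m),
\]

to conclude it is sufficient to prove that

\[
W_2^2(f_0 m, f_1 m) \le \int_X |D\varphi|_w^2\, f_0\, dm. \tag{7.6}
\]

Now assume that $f_0$ and $f_1$ are bounded from above and let $\tilde\pi \in \mathrm{GeoOpt}(f_0 m, f_1 m)$ be a
test plan (such $\tilde\pi$ exists thanks to Proposition 7.2). Since $\varphi$ is a Kantorovich potential and
$(e_0,e_1)_\sharp\tilde\pi$ is optimal, it holds $\gamma_1 \in \partial^c\varphi(\gamma_0)$ for any $\gamma \in \mathrm{supp}(\tilde\pi)$. Hence arguing as before we
get

\[
\varphi(\gamma_0) - \varphi(\gamma_t) \ge \frac{d^2(\gamma_0,\gamma_1)}{2} - \frac{d^2(\gamma_t,\gamma_1)}{2} = d^2(\gamma_0,\gamma_1)\,(t - t^2/2). \tag{7.7}
\]

Dividing by $d(\gamma_0,\gamma_t) = t\, d(\gamma_0,\gamma_1)$, squaring and integrating w.r.t. $\tilde\pi$ we obtain

\[
\liminf_{t \downarrow 0} \int \Bigl( \frac{\varphi(\gamma_0) - \varphi(\gamma_t)}{d(\gamma_0,\gamma_t)} \Bigr)^2 d\tilde\pi(\gamma) \ge \int d^2(\gamma_0,\gamma_1)\, d\tilde\pi(\gamma) = W_2^2(f_0 m, f_1 m). \tag{7.8}
\]

Using Remark 4.15 and the fact that $\tilde\pi$ is a test plan we have

\[
\begin{aligned}
\int \Bigl( \frac{\varphi(\gamma_0) - \varphi(\gamma_t)}{d(\gamma_0,\gamma_t)} \Bigr)^2 d\tilde\pi(\gamma) &\le \int \frac{1}{t^2} \Bigl( \int_0^t |D\varphi|_w(\gamma_s)\, ds \Bigr)^2 d\tilde\pi(\gamma) \le \frac{1}{t} \iint_0^t |D\varphi|_w^2(\gamma_s)\, ds\, d\tilde\pi(\gamma) \\
&= \frac{1}{t} \int_0^t \int_X |D\varphi|_w^2\, d(e_s)_\sharp\tilde\pi\, ds = \frac{1}{t} \int_X \int_0^t |D\varphi|_w^2\, \tilde f_s\, ds\, dm,
\end{aligned} \tag{7.9}
\]

where $\tilde f_s$ is the density of $(e_s)_\sharp\tilde\pi$. Since $(e_t)_\sharp\tilde\pi$ weakly converges to $(e_0)_\sharp\tilde\pi$ as $t \downarrow 0$ and
$\mathrm{Ent}_m((e_t)_\sharp\tilde\pi)$ is uniformly bounded (by the $K$-geodesic convexity), we conclude that $\tilde f_t \to f_0$
weakly in $L^1(X,m)$ and since $|D\varphi|_w \in L^\infty(X,m)$ we have

\[
\lim_{t \downarrow 0} \frac{1}{t} \int_X \int_0^t |D\varphi|_w^2\, \tilde f_s\, ds\, dm = \int_X |D\varphi|_w^2\, f_0\, dm. \tag{7.10}
\]

Equations (7.8), (7.9) and (7.10) yield (7.6).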

In order to prove (7.6) in the general case of possibly unbounded densities, let
us fix a Kantorovich potential $\varphi$, $\pi \in \mathrm{GeoOpt}(f_0 m, f_1 m)$ and for $n \in \mathbb{N}$ define
$\pi^n := c_n\, \pi|_{\{\gamma \,:\, f_0(\gamma_0) + f_1(\gamma_1) \le n\}}$, $c_n \to 1$ being the normalization constant. Then $\pi^n \in
\mathrm{GeoOpt}(f^n_0 m, f^n_1 m)$, where $f^n_i m := (e_i)_\sharp\pi^n$, $\varphi$ is a Kantorovich potential for $(f^n_0 m, f^n_1 m)$ and
$f^n_0, f^n_1 \in L^\infty(X,m)$. Thus from what we just proved we know that it holds

\[
d(\gamma_0,\gamma_1) = |D\varphi|_w(\gamma_0) = |D^+\varphi|(\gamma_0), \qquad \text{for } \pi^n\text{-a.e. } \gamma.
\]

Letting $n \to \infty$ we conclude.

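Concerning (7.5), we can choose $\tilde\pi = \pi$ and obtain by (7.7) and (7.4)

\[
\frac{\varphi(\gamma_0) - \varphi(\gamma_t)}{d(\gamma_0,\gamma_t)} \ge 0, \qquad \liminf_{t \downarrow 0} \frac{\varphi(\gamma_0) - \varphi(\gamma_t)}{d(\gamma_0,\gamma_t)} \ge |D^+\varphi|(\gamma_0) \qquad \text{for } \pi\text{-a.e. } \gamma.
\]

On the other hand (7.9) and (7.10) yield

\[
\limsup_{t \downarrow 0} \int \Bigl( \frac{\varphi(\gamma_0) - \varphi(\gamma_t)}{d(\gamma_0,\gamma_t)} \Bigr)^2 d\pi(\gamma) \le \int |D^+\varphi|^2(\gamma_0)\, d\pi(\gamma),
\]

so that, by expanding the square and applying Fatou's Lemma, we obtain

\[
\limsup_{t \downarrow 0} \int \Bigl( \frac{\varphi(\gamma_0) - \varphi(\gamma_t)}{d(\gamma_0,\gamma_t)} - |D^+\varphi|(\gamma_0) \Bigr)^2 d\pi(\gamma) \le 0.
\]

□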



8 More on calculus on compact $CD(K,\infty)$ spaces

8.1 On horizontal and vertical derivatives again 

Aim of this subsection is to prove another deep relation between "horizontal" and "vertical" 
derivation, which will allow to compare the derivative of the squared Wasserstein distance 
along the heat flow with the derivative of the relative entropy along a geodesic (see the next 
subsection). This will be key in order to understand the properties of spaces with Riemannian
Ricci curvature bounded from below, illustrated in the last section.

In order to understand the geometric point, consider the following simple example. 

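Example 8.1 Let $\|\cdot\|$ be a smooth, strictly convex norm on $\mathbb{R}^d$ and let $\|\cdot\|_*$ be the dual
norm. Denoting by $\langle\cdot,\cdot\rangle$ the canonical duality from $(\mathbb{R}^d)^* \times \mathbb{R}^d$ into $\mathbb{R}$, let $\mathcal{L}$ be the duality
map from $(\mathbb{R}^d, \|\cdot\|)$ to $((\mathbb{R}^d)^*, \|\cdot\|_*)$, characterized by

\[
\langle \mathcal{L}(u), u \rangle = \|\mathcal{L}(u)\|_* \|u\| \quad \text{and} \quad \|\mathcal{L}(u)\|_* = \|u\| \qquad \forall u \in \mathbb{R}^d,
\]

and let $\mathcal{L}^*$ be its inverse, equally characterized by

\[
\langle v, \mathcal{L}^*(v) \rangle = \|v\|_* \|\mathcal{L}^*(v)\| \quad \text{and} \quad \|\mathcal{L}^*(v)\| = \|v\|_* \qquad \forall v \in (\mathbb{R}^d)^*.
\]

Using the fact that $\varepsilon \mapsto \|u\| \|u + \varepsilon u'\| - \langle \mathcal{L}u, u + \varepsilon u' \rangle$ attains its minimum at $\varepsilon = 0$ and the
analogous relation for $\mathcal{L}^*$, one obtains the useful relations

\[
\langle \mathcal{L}(u), u' \rangle = \tfrac12\, d_u \|\cdot\|^2 (u'), \qquad \langle v', \mathcal{L}^*(v) \rangle = \tfrac12\, d_v \|\cdot\|_*^2 (v'). \tag{8.1}
\]

For a smooth map $f : \mathbb{R}^d \to \mathbb{R}$ its differential $d_x f$ at any point $x$ is intrinsically defined as
cotangent vector, namely as an element of $(\mathbb{R}^d)^*$. To define the gradient $\nabla f(x) \in \mathbb{R}^d$ (which
is a tangent vector), the norm comes into play via the formula $\nabla f(x) := \mathcal{L}^*(d_x f)$. Now, given
two smooth functions $f, g$, the real number $d_x f(\nabla g(x))$ is well defined as the application of
the cotangent vector $d_x f$ to the tangent vector $\nabla g(x)$.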

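What we want to point out, is that there are two very different ways of obtaining
$d_x f(\nabla g(x))$ from a derivation. The first one, which is usually taken as the definition of
$d_x f(\nabla g(x))$, is the "horizontal derivative":

\[
\langle d_x f, \nabla g \rangle = d_x f(\nabla g(x)) = \lim_{t \to 0} \frac{f(x + t \nabla g(x)) - f(x)}{t}. \tag{8.2}
\]

The second one is the "vertical derivative":

\[
\langle d_x f, \nabla g \rangle = \lim_{\varepsilon \to 0} \frac{\tfrac12 \|d_x(g + \varepsilon f)\|_*^2 - \tfrac12 \|d_x g\|_*^2}{\varepsilon}. \tag{8.3}
\]

It is not difficult to check that (8.3) is consistent with (8.2): indeed (omitting the $x$ dependence),
recalling the second identity of (8.1), we have

\[
\|dg + \varepsilon\, df\|_*^2 = \|dg\|_*^2 + 2\varepsilon \langle df, \mathcal{L}^*(dg) \rangle + o(\varepsilon) = \|dg\|_*^2 + 2\varepsilon \langle df, \nabla g \rangle + o(\varepsilon).
\]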
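The agreement of the two derivations can be checked numerically in a concrete non-Hilbertian case. The sketch below is an illustration under our own assumptions, none of which come from the text: the norm is the $\ell^4$ norm on $\mathbb{R}^2$ (so the dual norm is $\ell^{4/3}$), the functions are $f(x) = \sin x_0 + x_0 x_1$ and $g(x) = x_0^2 + 3x_1$, and the limits are replaced by finite difference quotients. For an $\ell^q$ dual norm the map $\mathcal{L}^*$ has the closed form used in `duality_map_inverse`, namely the gradient of $\tfrac12\|\cdot\|_q^2$.

```python
import math

P = 4.0            # exponent of the primal norm (assumption of this sketch)
Q = P / (P - 1.0)  # conjugate exponent, giving the dual norm


def dual_norm(v):
    """l^Q norm of a covector v."""
    return sum(abs(c) ** Q for c in v) ** (1.0 / Q)


def duality_map_inverse(v):
    """L*(v): gradient of (1/2)||v||_Q^2, a tangent vector with l^P norm ||v||_Q."""
    n = dual_norm(v)
    if n == 0.0:
        return [0.0 for _ in v]
    return [n ** (2.0 - Q) * abs(c) ** (Q - 1.0) * math.copysign(1.0, c) for c in v]


def f(x):
    return math.sin(x[0]) + x[0] * x[1]


def df(x):
    """Differential of f."""
    return [math.cos(x[0]) + x[1], x[0]]


def dg(x):
    """Differential of g(x) = x0^2 + 3*x1."""
    return [2.0 * x[0], 3.0]


def exact_pairing(x):
    """<d_x f, grad g(x)> with grad g = L*(d_x g)."""
    grad_g = duality_map_inverse(dg(x))
    return sum(a * b for a, b in zip(df(x), grad_g))


def horizontal(x, t=1e-6):
    """Difference quotient of (8.2): move the point along grad g."""
    grad_g = duality_map_inverse(dg(x))
    moved = [xi + t * gi for xi, gi in zip(x, grad_g)]
    return (f(moved) - f(x)) / t


def vertical(x, eps=1e-6):
    """Difference quotient of (8.3): perturb the function g by eps*f."""
    perturbed = [a + eps * b for a, b in zip(dg(x), df(x))]
    return (0.5 * dual_norm(perturbed) ** 2 - 0.5 * dual_norm(dg(x)) ** 2) / eps
```

At a point such as `x = [0.7, -0.2]`, the three quantities `exact_pairing(x)`, `horizontal(x)` and `vertical(x)` should agree up to the discretization error of the difference quotients.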



The point is that the equality between the right hand sides of formulas (8.3) and (8.2) extends
to a genuine metric setting. In the following lemma (where the plan $\pi$ plays the role of $-\nabla g$)
we prove one inequality, but we remark that "playing with signs" it is possible to obtain an
analogous inequality with $\le$ in place of $\ge$.

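Lemma 8.2 (Horizontal and vertical derivatives) Let $f$ be a Sobolev function along
a.e. curve with $|Df|_w \in L^2(X,m)$, let $g : X \to \mathbb{R}$ be Lipschitz and let $\pi$ be a test plan
concentrated on $\mathrm{Geo}(X)$. Assume that

\[
\lim_{t \downarrow 0} \frac{g(\gamma_0) - g(\gamma_t)}{d(\gamma_0,\gamma_t)} = |Dg|_w(\gamma_0) \qquad \text{in } L^2(\mathrm{Geo}(X), \pi). \tag{8.4}
\]

Then

\[
\liminf_{t \downarrow 0} \int \frac{f(\gamma_t) - f(\gamma_0)}{t}\, d\pi(\gamma) \ge \frac12 \int \frac{|Dg|_w^2(\gamma_0) - |D(g + \varepsilon f)|_w^2(\gamma_0)}{\varepsilon}\, d\pi(\gamma) \qquad \forall \varepsilon > 0. \tag{8.5}
\]
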

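Proof Define the functions $F_t, G_t : \mathrm{Geo}(X) \to \mathbb{R} \cup \{\pm\infty\}$ by

\[
F_t(\gamma) := \frac{f(\gamma_0) - f(\gamma_t)}{d(\gamma_0,\gamma_t)}, \qquad G_t(\gamma) := \frac{g(\gamma_0) - g(\gamma_t)}{d(\gamma_0,\gamma_t)}.
\]

By (8.4) it holds

\[
\int |Dg|_w^2 \circ e_0\, d\pi = \lim_{t \downarrow 0} \int G_t^2\, d\pi. \tag{8.6}
\]

Since the measures $(e_t)_\sharp\pi \to (e_0)_\sharp\pi$ weakly in duality with $C(X)$ as $t \downarrow 0$ and their densities
with respect to $m$ are uniformly bounded, we obtain that the densities are weakly* convergent
in $L^\infty(X,m)$. Therefore, using the fact that $|D(g + \varepsilon f)|_w^2 \in L^1(X,m)$ and taking into account
Remark 4.15 we obtain

\[
\begin{aligned}
\int |D(g + \varepsilon f)|_w^2 \circ e_0\, d\pi &= \int |D(g + \varepsilon f)|_w^2\, d(e_0)_\sharp\pi = \lim_{t \downarrow 0} \frac{1}{t} \int_0^t \int |D(g + \varepsilon f)|_w^2\, d(e_s)_\sharp\pi\, ds \\
&= \lim_{t \downarrow 0} \frac{1}{t} \iint_0^t |D(g + \varepsilon f)|_w^2(\gamma_s)\, ds\, d\pi(\gamma) \ge \limsup_{t \downarrow 0} \int \Bigl( \frac{(g + \varepsilon f)(\gamma_0) - (g + \varepsilon f)(\gamma_t)}{t\, d(\gamma_0,\gamma_1)} \Bigr)^2 d\pi(\gamma) \\
&\ge \limsup_{t \downarrow 0} \int \bigl( G_t^2 + 2\varepsilon\, G_t F_t \bigr)\, d\pi.
\end{aligned}
\]

Subtracting this inequality from (8.6) and dividing by $2\varepsilon$ we get

\[
\frac12 \int \frac{|Dg|_w^2(\gamma_0) - |D(g + \varepsilon f)|_w^2(\gamma_0)}{\varepsilon}\, d\pi(\gamma) \le \liminf_{t \downarrow 0} - \int G_t(\gamma) F_t(\gamma)\, d\pi(\gamma).
\]

We know that $G_t \to |Dg|_w \circ e_0$ in $L^2(\mathrm{Geo}(X), \pi)$ and that $|Dg|_w(\gamma_0) = d(\gamma_0,\gamma_1)$ for $\pi$-a.e. $\gamma$.
Also, by Remark 4.15 and the fact that $\pi$ is a test plan we easily get $\sup_{t \in (0,1)} \|F_t\|_{L^2(\pi)} < \infty$.
Thus it holds

\[
\liminf_{t \downarrow 0} - \int G_t(\gamma) F_t(\gamma)\, d\pi(\gamma) = \liminf_{t \downarrow 0} - \int d(\gamma_0,\gamma_1) F_t(\gamma)\, d\pi(\gamma) = \liminf_{t \downarrow 0} \int \frac{f(\gamma_t) - f(\gamma_0)}{t}\, d\pi(\gamma),
\]

which is the thesis. □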



8.2 Two important formulas 

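Proposition 8.3 (Derivative of $\frac12 W_2^2$ along the heat flow) Let $(f_t) \subset L^2(X,m)$ be a
heat flow made of probability densities. Then for every $\sigma \in \mathscr{P}(X)$, for a.e. $t \in (0,\infty)$ it
holds:

\[
\frac{d}{dt} \frac12 W_2^2(f_t m, \sigma) = \int_X \varphi_t\, \Delta f_t\, dm, \qquad \text{for any Kantorovich potential } \varphi_t \text{ from } f_t m \text{ to } \sigma. \tag{8.7}
\]

Proof Since $t \mapsto f_t m$ is an absolutely continuous curve w.r.t. $W_2$ (recall Theorem 6.2), the
derivative at the left hand side of (8.7) exists for a.e. $t \in (0,\infty)$. Also, for a.e. $t \in (0,\infty)$ it
holds $\lim_{h \to 0} \frac{1}{h}(f_{t+h} - f_t) = \Delta f_t$, the limit being understood in $L^2(X,m)$.

Fix $t_0$ such that the derivative of the Wasserstein distance exists and the above limit holds
and choose any Kantorovich potential $\varphi_{t_0}$ for $(f_{t_0} m, \sigma)$. We have

\[
\begin{aligned}
\frac{W_2^2(f_{t_0} m, \sigma)}{2} &= \int_X \varphi_{t_0} f_{t_0}\, dm + \int \varphi^c_{t_0}\, d\sigma, \\
\frac{W_2^2(f_{t_0+h} m, \sigma)}{2} &\ge \int_X \varphi_{t_0} f_{t_0+h}\, dm + \int \varphi^c_{t_0}\, d\sigma.
\end{aligned}
\]

Therefore, since $\varphi_{t_0} \in L^\infty(X,m)$ we get

\[
\frac{W_2^2(f_{t_0+h} m, \sigma)}{2} - \frac{W_2^2(f_{t_0} m, \sigma)}{2} \ge \int_X \varphi_{t_0} (f_{t_0+h} - f_{t_0})\, dm = h \int_X \varphi_{t_0}\, \Delta f_{t_0}\, dm + o(h).
\]

Dividing by $h < 0$ and $h > 0$ and letting $h \to 0$ we get the thesis. □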



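Proposition 8.4 (Derivative of the Entropy along a geodesic) Let $(X,d,m)$ be a
strong $CD(K,\infty)$ space. Let $\mu_0, \mu_1 \in \mathscr{P}(X)$, $\pi \in \mathrm{GeoOpt}(\mu_0,\mu_1)$ and $\varphi$ a Kantorovich
potential for $(\mu_0,\mu_1)$. Assume that $\pi$ is a test plan and that $\mu_0 \ge c\,m$ for some $c > 0$ and
denote by $h_t$ the density of $\mu_t := (e_t)_\sharp\pi$. Then

\[
\liminf_{t \downarrow 0} \frac{\mathrm{Ent}_m(\mu_t) - \mathrm{Ent}_m(\mu_0)}{t} \ge \lim_{\varepsilon \downarrow 0} \frac{\mathrm{Ch}(\varphi) - \mathrm{Ch}(\varphi + \varepsilon h_0)}{\varepsilon}.
\]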
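Proof The convexity of $\mathrm{Ch}$ ensures that the limit at the right hand side exists. From the fact
that $\varphi$ is Lipschitz, it is not hard to see that $h_0 \notin D(\mathrm{Ch})$ implies $\mathrm{Ch}(\varphi + \varepsilon h_0) = +\infty$ for any
$\varepsilon > 0$ and in this case there is nothing to prove. Thus, we assume that $h_0 \in D(\mathrm{Ch})$.
The convexity of $z \mapsto z \log z$ gives

\[
\frac{\mathrm{Ent}_m(\mu_t) - \mathrm{Ent}_m(\mu_0)}{t} \ge \int \log h_0\, \frac{h_t - h_0}{t}\, dm = \int \frac{\log(h_0 \circ e_t) - \log(h_0 \circ e_0)}{t}\, d\pi. \tag{8.9}
\]

Using the trivial inequality given by Taylor's formula

\[
\log b - \log a \ge \frac{b - a}{a} - \frac{|b - a|^2}{2c^2},
\]

valid for any $a, b \in [c, \infty)$, we obtain

\[
\int \frac{\log(h_0 \circ e_t) - \log(h_0 \circ e_0)}{t}\, d\pi \ge \int \frac{h_0 \circ e_t - h_0 \circ e_0}{t\, h_0 \circ e_0}\, d\pi - \frac{1}{2tc^2} \int |h_0 \circ e_t - h_0 \circ e_0|^2\, d\pi. \tag{8.10}
\]

Taking into account Remark 4.15 and the fact that $|\dot\gamma_t| = d(\gamma_0,\gamma_1) \le \mathrm{diam}(X)$ for a.e.
$t \in (0,1)$ and $\pi$-a.e. $\gamma$, the last term in this expression can be bounded from above by

\[
\frac{1}{2tc^2} \int \Bigl( \int_0^t \mathrm{diam}(X)\, |Dh_0|_w \circ e_s\, ds \Bigr)^2 d\pi \le \frac{\mathrm{diam}(X)^2}{2c^2} \iint_0^t |Dh_0|_w^2 \circ e_s\, ds\, d\pi, \tag{8.11}
\]

which goes to 0 as $t \to 0$.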
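The elementary inequality behind (8.10), $\log b - \log a \ge \frac{b-a}{a} - \frac{|b-a|^2}{2c^2}$ for $a, b \ge c > 0$, follows from a second order Taylor expansion of the logarithm with the second derivative bounded by $1/c^2$ on $[c,\infty)$. A quick randomized check, given here only as an illustrative sketch with function names of our choosing:

```python
import math
import random


def taylor_gap(a, b, c):
    """log b - log a minus the lower bound (b - a)/a - (b - a)^2 / (2 c^2)."""
    lower = (b - a) / a - (b - a) ** 2 / (2.0 * c * c)
    return (math.log(b) - math.log(a)) - lower


def min_gap(trials=10000, seed=2):
    """Smallest observed gap over random a, b in [c, c + 5]; should be >= 0."""
    rng = random.Random(seed)
    worst = math.inf
    for _ in range(trials):
        c = rng.uniform(0.1, 2.0)
        a = c + rng.uniform(0.0, 5.0)
        b = c + rng.uniform(0.0, 5.0)
        worst = min(worst, taylor_gap(a, b, c))
    return worst
```

The lower bound $h_0 \ge c$ assumed in the statement is exactly what makes the quadratic remainder controllable here.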

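Now let $S : \mathrm{Geo}(X) \to \mathbb{R}$ be the Borel function defined by $S(\gamma) := h_0(\gamma_0)$ and define
$\tilde\pi := \frac{1}{S}\pi$. It is easy to check that $(e_0)_\sharp\tilde\pi = m$, so that in particular $\tilde\pi$ is a probability measure.
Also, the bound $h_0 \ge c > 0$ ensures that $\tilde\pi$ is a test plan. By definition we have

\[
\int \frac{h_0 \circ e_t - h_0 \circ e_0}{t\, h_0 \circ e_0}\, d\pi = \int \frac{h_0 \circ e_t - h_0 \circ e_0}{t}\, d\tilde\pi.
\]

The latter equality and inequalities (8.9), (8.10) and (8.11) ensure that to conclude it is
sufficient to show that

\[
\liminf_{t \downarrow 0} \int \frac{h_0 \circ e_t - h_0 \circ e_0}{t}\, d\tilde\pi \ge \lim_{\varepsilon \downarrow 0} \frac{\mathrm{Ch}(\varphi) - \mathrm{Ch}(\varphi + \varepsilon h_0)}{\varepsilon}. \tag{8.12}
\]

Here we apply the key Lemma 8.2. Observe that Theorem 7.3 ensures that

\[
|D\varphi|_w(\gamma_0) = \lim_{t \downarrow 0} \frac{\varphi(\gamma_0) - \varphi(\gamma_t)}{d(\gamma_0,\gamma_t)} = d(\gamma_0,\gamma_1),
\]

where the convergence is understood in $L^2(\pi)$. Thus the same holds for $L^2(\tilde\pi)$ and the
hypotheses of Lemma 8.2 are satisfied with $\tilde\pi$ as test plan, $g := \varphi$ and $f := h_0$. Equation
(8.5) then gives

\[
\liminf_{t \downarrow 0} \int \frac{h_0 \circ e_t - h_0 \circ e_0}{t}\, d\tilde\pi \ge \lim_{\varepsilon \downarrow 0} \frac12 \int_X \frac{|D\varphi|_w^2 - |D(\varphi + \varepsilon h_0)|_w^2}{\varepsilon}\, dm,
\]

which concludes the proof. □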
9 Riemannian Ricci bounds

We say that $(X,d,m)$ has Riemannian Ricci curvature bounded below by $K \in \mathbb{R}$ (in short, it
is a $RCD(K,\infty)$ space) if any of the 3 equivalent conditions stated in the following theorem
is true.

Theorem 9.1 Let $(X,d,m)$ be a compact and normalized metric measure space and $K \in \mathbb{R}$.
The following three properties are equivalent.

(i) $(X,d,m)$ is a strong $CD(K,\infty)$ space (Definition 7.1) and the $L^2$-gradient flow of $\mathrm{Ch}$
is linear.

(ii) $(X,d,m)$ is a strong $CD(K,\infty)$ space (Definition 7.1) and Cheeger's energy is quadratic,
i.e.

\[
2\bigl( \mathrm{Ch}(f) + \mathrm{Ch}(g) \bigr) = \mathrm{Ch}(f + g) + \mathrm{Ch}(f - g), \qquad \forall f, g \in L^2(X,m). \tag{9.1}
\]

(iii) $\mathrm{supp}(m)$ is geodesic and for any $\mu \in D(\mathrm{Ent}_m) \subset \mathscr{P}(X)$ there exists an $EVI_K$-gradient
flow for $\mathrm{Ent}_m$ starting from $\mu$.

Proof 

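(i) $\Rightarrow$ (ii). Since the heat semigroup $P_t$ in $L^2(X,m)$ is linear we obtain that $\Delta$ is a linear
operator (i.e. its domain $D(\Delta)$ is a subspace of $L^2(X,m)$ and $\Delta : D(\Delta) \to L^2(X,m)$ is linear).
Since $t \mapsto \mathrm{Ch}(P_t(f))$ is locally Lipschitz, tends to 0 as $t \to \infty$ and $\frac{d}{dt} \mathrm{Ch}(P_t(f)) = -\|\Delta P_t(f)\|^2_{L^2}$
for a.e. $t > 0$ (see (4.4)), we have

\[
\mathrm{Ch}(f) = \int_0^\infty \|\Delta P_t(f)\|^2_{L^2(X,m)}\, dt.
\]

Therefore $\mathrm{Ch}$, being an integral of quadratic forms, is a quadratic form. Specifically, for any
$f, g \in L^2(X,m)$ it holds

\[
\begin{aligned}
\mathrm{Ch}(f + g) + \mathrm{Ch}(f - g) &= \int_0^\infty \|\Delta P_t(f + g)\|^2_{L^2(X,m)} + \|\Delta P_t(f - g)\|^2_{L^2(X,m)}\, dt \\
&= \int_0^\infty \|\Delta P_t(f) + \Delta P_t(g)\|^2_{L^2(X,m)} + \|\Delta P_t(f) - \Delta P_t(g)\|^2_{L^2(X,m)}\, dt \\
&= \int_0^\infty 2\|\Delta P_t(f)\|^2_{L^2(X,m)} + 2\|\Delta P_t(g)\|^2_{L^2(X,m)}\, dt \\
&= 2\,\mathrm{Ch}(f) + 2\,\mathrm{Ch}(g).
\end{aligned}
\]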
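To see concretely what the parallelogram identity (9.1) distinguishes, one can compare two toy energies on a finite graph. This is purely a sketch under our own assumptions: neither energy below is Cheeger's energy of the text, the graph and the test functions are arbitrary choices, and all names are ours. The first energy sums squared differences over edges and is a quadratic form; the second squares, at each vertex, the largest difference with a neighbour, which is 2-homogeneous and convex but not quadratic.

```python
def quadratic_energy(h, edges):
    """(1/2) * sum over edges of (h_i - h_j)^2: a quadratic form in h."""
    return 0.5 * sum((h[i] - h[j]) ** 2 for i, j in edges)


def max_slope_energy(h, edges):
    """(1/2) * sum over vertices of (max difference with a neighbour)^2."""
    slopes = [0.0] * len(h)
    for i, j in edges:
        diff = abs(h[i] - h[j])
        slopes[i] = max(slopes[i], diff)
        slopes[j] = max(slopes[j], diff)
    return 0.5 * sum(s * s for s in slopes)


def parallelogram_defect(energy, f, g, edges):
    """2(E(f) + E(g)) - E(f + g) - E(f - g); zero for every f, g iff (9.1) holds."""
    plus = [a + b for a, b in zip(f, g)]
    minus = [a - b for a, b in zip(f, g)]
    return (2.0 * (energy(f, edges) + energy(g, edges))
            - energy(plus, edges) - energy(minus, edges))


# Path graph with three vertices and two sample functions (illustrative data).
PATH_EDGES = [(0, 1), (1, 2)]
F = [0.0, 1.0, 0.0]
G = [0.0, 0.0, 1.0]
```

Evaluating `parallelogram_defect` on `quadratic_energy` and on `max_slope_energy` with the data above shows that only the first satisfies the identity; in the language of the text, only the first behaves like a "Riemannian" energy.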

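(ii) $\Rightarrow$ (iii). By [31, Remark 4.6(iii)] $(\mathrm{supp}(m), d)$ is a length space and therefore it is also
geodesic, since $X$ is compact.

Thanks to Remark 2.6 it is sufficient to prove that a gradient flow in the $EVI_K$ sense
exists for an initial datum $\mu_0 \ll m$ with density bounded away from 0 and infinity. Let $f_0$ be
this density, $(f_t)$ the heat flow starting from it and recall that from the maximum principle
4.9 we know that the $f_t$'s are far from 0 and infinity as well for any $t > 0$. Fix a reference
probability measure $\sigma$ with density bounded away from 0 and infinity as well. For any $t \ge 0$
pick a test plan $\pi_t$ optimal for $(f_t m, \sigma)$. Define $\sigma^s_t := (e_s)_\sharp\pi_t$.

We claim that for a.e. $t \in (0,\infty)$ it holds

\[
\frac{d}{dt} \frac12 W_2^2(f_t m, \sigma) \le \lim_{s \downarrow 0} \frac{\mathrm{Ent}_m(\sigma^s_t) - \mathrm{Ent}_m(\sigma^0_t)}{s}. \tag{9.2}
\]

Let $\varphi_t$ be a Kantorovich potential for $(f_t m, \sigma)$. By Proposition 8.3 we know that for a.e. $t \in
(0,\infty)$ it holds

\[
\frac{d}{dt} \frac12 W_2^2(f_t m, \sigma) = \int_X \varphi_t\, \Delta f_t\, dm \le \lim_{\varepsilon \downarrow 0} \frac{\mathrm{Ch}(f_t - \varepsilon \varphi_t) - \mathrm{Ch}(f_t)}{\varepsilon},
\]

while from Proposition 8.4 we have that for any $t > 0$ it holds

\[
\lim_{s \downarrow 0} \frac{\mathrm{Ent}_m(\sigma^s_t) - \mathrm{Ent}_m(\sigma^0_t)}{s} \ge \lim_{\varepsilon \downarrow 0} \frac{\mathrm{Ch}(\varphi_t) - \mathrm{Ch}(\varphi_t + \varepsilon f_t)}{\varepsilon}.
\]

Here we use the fact that $\mathrm{Ch}$ is quadratic. Indeed in this case simple algebraic manipulations
show that

\[
\frac{\mathrm{Ch}(f_t - \varepsilon \varphi_t) - \mathrm{Ch}(f_t)}{\varepsilon} = \frac{\mathrm{Ch}(\varphi_t) - \mathrm{Ch}(\varphi_t + \varepsilon f_t)}{\varepsilon} + O(\varepsilon),
\]

and therefore (9.2) is proved.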
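For the reader's convenience, the algebra behind the last identity can be sketched as follows. We assume, for illustration only, that $f_t, \varphi_t \in D(\mathrm{Ch})$, and we write $\mathcal{E}$ (a symbol not used in the text) for the symmetric bilinear form obtained from the quadratic form $\mathrm{Ch}$ by polarization, so that $\mathrm{Ch}(u) = \mathcal{E}(u,u)$.

```latex
\begin{align*}
\mathrm{Ch}(f_t - \varepsilon\varphi_t) - \mathrm{Ch}(f_t)
  &= -2\varepsilon\,\mathcal{E}(f_t,\varphi_t) + \varepsilon^2\,\mathrm{Ch}(\varphi_t), \\
\mathrm{Ch}(\varphi_t) - \mathrm{Ch}(\varphi_t + \varepsilon f_t)
  &= -2\varepsilon\,\mathcal{E}(\varphi_t, f_t) - \varepsilon^2\,\mathrm{Ch}(f_t), \\
\frac{\mathrm{Ch}(f_t - \varepsilon\varphi_t) - \mathrm{Ch}(f_t)}{\varepsilon}
  - \frac{\mathrm{Ch}(\varphi_t) - \mathrm{Ch}(\varphi_t + \varepsilon f_t)}{\varepsilon}
  &= \varepsilon\,\bigl(\mathrm{Ch}(\varphi_t) + \mathrm{Ch}(f_t)\bigr).
\end{align*}
```

The first order terms cancel by symmetry of $\mathcal{E}$, and the remainder vanishes as $\varepsilon \downarrow 0$, which is the only place where quadraticity of $\mathrm{Ch}$ enters.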

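Now notice that the $K$-convexity of the entropy yields

\[
\lim_{s \downarrow 0} \frac{\mathrm{Ent}_m(\sigma^s_t) - \mathrm{Ent}_m(\sigma^0_t)}{s} \le \mathrm{Ent}_m(\sigma) - \mathrm{Ent}_m(f_t m) - \frac{K}{2} W_2^2(f_t m, \sigma),
\]

and therefore we have

\[
\frac{d}{dt} \frac12 W_2^2(f_t m, \sigma) + \mathrm{Ent}_m(f_t m) + \frac{K}{2} W_2^2(f_t m, \sigma) \le \mathrm{Ent}_m(\sigma), \qquad \text{for a.e. } t \in (0,\infty).
\]

By Proposition 2.3 we conclude.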



(iii) $\Rightarrow$ (i). Since $(\mathrm{supp}(m), d)$ is geodesic, so is $(D(\mathrm{Ent}_m), W_2)$, which together with
existence of $\mathrm{EVI}_K$-gradient flows for $\mathrm{Ent}_m$ yields, via Proposition 2.7, $K$-geodesic convexity of
$\mathrm{Ent}_m$ along all geodesics in $D(\mathrm{Ent}_m)$. In particular, $(X, d, m)$ is a strong $CD(K, \infty)$ space.

We turn to the linearity. Let $(\mu_t^0)$, $(\mu_t^1)$ be two $\mathrm{EVI}_K$-gradient flows of the relative entropy
and, for $\lambda \in (0, 1)$ fixed, define $\mu_t^\lambda := (1 - \lambda)\mu_t^0 + \lambda\mu_t^1$.

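We claim that $(\mu_t^\lambda)$ is an $\mathrm{EVI}_K$-gradient flow of $\mathrm{Ent}_m$. To prove this, fix $\nu \in \mathscr{P}(X)$, $t > 0$
and an optimal plan $\gamma \in \mathrm{Opt}(\mu_t^\lambda, \nu)$. Since $\mu_t^i \ll \mu_t^\lambda = \pi^1_\sharp \gamma$ for $i = 0, 1$ we can define,
as in Definition 5.3, the plans $\gamma_{\mu_t^i} \in \mathscr{P}(X^2)$ and the measures $\nu^i := \gamma_\sharp \mu_t^i$, $i = 0, 1$. Since
$\mathrm{supp}(\gamma_{\mu_t^i}) \subset \mathrm{supp}(\gamma)$, we have that $\gamma_{\mu_t^i} \in \mathrm{Opt}(\mu_t^i, \nu^i)$, therefore from $\gamma = (1 - \lambda)\gamma_{\mu_t^0} + \lambda\gamma_{\mu_t^1}$
we deduce

$$W_2^2(\mu_t^\lambda, \nu) = (1 - \lambda) W_2^2(\mu_t^0, \nu^0) + \lambda W_2^2(\mu_t^1, \nu^1). \qquad (9.3)$$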

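On the other hand, from the convexity of the squared Wasserstein distance we immediately
get that

$$W_2^2(\mu_{t+h}^\lambda, \nu) \le (1 - \lambda) W_2^2(\mu_{t+h}^0, \nu^0) + \lambda W_2^2(\mu_{t+h}^1, \nu^1), \qquad \forall h > 0. \qquad (9.4)$$

Furthermore, recalling (iii) of Proposition 5.4, we get

$$\mathrm{Ent}_m(\mu_t^\lambda) - \mathrm{Ent}_m(\nu) \le (1 - \lambda)\big(\mathrm{Ent}_m(\mu_t^0) - \mathrm{Ent}_m(\nu^0)\big) + \lambda\big(\mathrm{Ent}_m(\mu_t^1) - \mathrm{Ent}_m(\nu^1)\big). \qquad (9.5)$$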

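The fact that $(\mu_t^0)$ and $(\mu_t^1)$ are $\mathrm{EVI}_K$-gradient flows for $\mathrm{Ent}_m$ (see in particular the
characterization (iii) given in Proposition 2.3), in conjunction with (9.3), (9.4) and (9.5), yields

$$\limsup_{h \downarrow 0} \frac{W_2^2(\mu_{t+h}^\lambda, \nu) - W_2^2(\mu_t^\lambda, \nu)}{2h} + \frac{K}{2} W_2^2(\mu_t^\lambda, \nu) + \mathrm{Ent}_m(\mu_t^\lambda) \le \mathrm{Ent}_m(\nu). \qquad (9.6)$$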



Since $t > 0$ and $\nu \in \mathscr{P}(X)$ were arbitrary, we proved that $(\mu_t^\lambda)$ is an $\mathrm{EVI}_K$-gradient flow of
$\mathrm{Ent}_m$ (see again (iii) of Proposition 2.3).
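
The passage from (9.3), (9.4) and (9.5) to (9.6) can be spelled out as follows. In this sketch $\lambda_0 := 1 - \lambda$ and $\lambda_1 := \lambda$ are shorthand introduced only for readability, and the pointwise formulation (iii) of Proposition 2.3 is applied to each flow $(\mu_t^i)$ with test measure $\nu^i$.

```latex
\begin{align*}
\limsup_{h\downarrow 0}\frac{W_2^2(\mu^\lambda_{t+h},\nu)-W_2^2(\mu^\lambda_t,\nu)}{2h}
 &\le \sum_{i=0,1}\lambda_i\,
      \limsup_{h\downarrow 0}\frac{W_2^2(\mu^i_{t+h},\nu^i)-W_2^2(\mu^i_t,\nu^i)}{2h}
 && \text{by (9.3) and (9.4)} \\
 &\le \sum_{i=0,1}\lambda_i\Big(\mathrm{Ent}_m(\nu^i)-\mathrm{Ent}_m(\mu^i_t)
      -\tfrac{K}{2}\,W_2^2(\mu^i_t,\nu^i)\Big)
 && \text{by EVI}_K \text{ for } (\mu^i_t) \\
 &\le \mathrm{Ent}_m(\nu)-\mathrm{Ent}_m(\mu^\lambda_t)
      -\tfrac{K}{2}\,W_2^2(\mu^\lambda_t,\nu)
 && \text{by (9.3) and (9.5)}.
\end{align*}
```

Rearranging the last line gives the pointwise $\mathrm{EVI}_K$ inequality for $(\mu_t^\lambda)$.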

Thus, recalling the identification of gradient flows, we proved that the $L^2$-heat flow is
additive in $D(\mathrm{Ent}_m)$. Since the heat flow in $L^2(X, m)$ commutes with additive and multiplicative
constants, it is easy to get from this linearity in the class of bounded functions. By
contractivity, linearity extends to the whole of $L^2(X, m)$. □

We conclude by discussing some basic properties of the spaces with Riemannian Ricci 
curvature bounded from below. 

We start by observing that Riemannian manifolds with Ricci curvature bounded below by
$K$ are $RCD(K, \infty)$ spaces, as they are non-branching $CD(K, \infty)$ spaces and the heat flow
is linear on them. Also, from the studies made in [27], [33], [25] and [16] we know that
finite dimensional Alexandrov spaces with curvature bounded from below are $RCD(K, \infty)$
spaces as well. On the other hand, Finsler manifolds are ruled out, as it is known (see for
instance [26]) that the heat flow is linear on a Finsler manifold if and only if the manifold is
Riemannian.

The stability of the $RCD(K, \infty)$ notion can be deduced from the stability of $\mathrm{EVI}_K$-gradient
flows w.r.t. $\Gamma$-convergence of functionals, which is an easy consequence of the integral
formulation in (a) of Proposition 2.3.

Hence $RCD(K, \infty)$ spaces have the same basic properties as $CD(K, \infty)$ spaces, which
gives this notion the right to be called a synthetic (or weak) notion of Ricci curvature
bound.

The point is then to understand the additional analytic/geometric properties of these
spaces, which come mainly from the addition of the linearity condition. A first consequence is that
the heat flow contracts, up to an exponential factor, the distance $W_2$, i.e.

$$W_2(\mu_t, \nu_t) \le e^{-Kt}\, W_2(\mu_0, \nu_0), \qquad \forall t \ge 0,$$

whenever $(\mu_t), (\nu_t) \subset \mathscr{P}_2(X)$ are gradient flows of the entropy.
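
The contraction estimate can be motivated by a short formal computation. The sketch below applies the differential form of the $\mathrm{EVI}_K$ inequality to both flows, each tested against the other one frozen at a fixed time, and then treats the derivative along the diagonal $s = t$ formally. A rigorous proof needs a more careful argument on the two time variables, so this is an illustration under that assumption rather than a complete proof.

```latex
% Formal sketch: EVI_K for (\mu_t) tested against \nu_s,
% and EVI_K for (\nu_s) tested against \mu_t.
\begin{align*}
\partial_t\,\tfrac12 W_2^2(\mu_t,\nu_s) + \tfrac{K}{2}\,W_2^2(\mu_t,\nu_s) + \mathrm{Ent}_m(\mu_t)
  &\le \mathrm{Ent}_m(\nu_s), \\
\partial_s\,\tfrac12 W_2^2(\mu_t,\nu_s) + \tfrac{K}{2}\,W_2^2(\mu_t,\nu_s) + \mathrm{Ent}_m(\nu_s)
  &\le \mathrm{Ent}_m(\mu_t).
\end{align*}
% Adding the two inequalities at s = t, the entropy terms cancel:
\[
\frac{d}{dt}\,\tfrac12 W_2^2(\mu_t,\nu_t) \le -K\,W_2^2(\mu_t,\nu_t)
\quad\Longrightarrow\quad
W_2^2(\mu_t,\nu_t) \le e^{-2Kt}\,W_2^2(\mu_0,\nu_0).
\]
```

Taking square roots in the last inequality gives the stated bound with the factor $e^{-Kt}$.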

By a duality argument (see [21], [15], [6]), this property implies the Bakry-Émery gradient
estimate

$$|D h_t(f)|_w^2(x) \le e^{-2Kt}\, h_t\big(|Df|_w^2\big)(x), \qquad \text{for } m\text{-a.e. } x \in X,$$

for all $t \ge 0$, where $h_t : L^2(X, m) \to L^2(X, m)$ is the heat flow seen as gradient flow of
Ch. If $(X, d, m)$ is doubling and supports a local Poincaré inequality, then one can also deduce the Lipschitz
regularity of the heat kernel (following an argument described in [15]).
Also, since in $RCD(K, \infty)$ spaces Ch is a quadratic form, if we define

$$\mathcal{E}(f, g) := \mathrm{Ch}(f + g) - \mathrm{Ch}(f) - \mathrm{Ch}(g), \qquad \forall f, g \in W^{1,2}(X, d, m),$$

we get a closed Dirichlet form on $L^2(X, m)$ (closure follows from the $L^2$-lower semicontinuity
of Ch). Hence it is natural to compare the calculus on $RCD(K, \infty)$ spaces with the abstract
one available for Dirichlet forms (see [11]). The picture here is pretty clear and consistent.
Recall that to any $f \in D(\mathcal{E})$ one can associate the energy measure $[f]$ defined by

$$[f](\varphi) := \mathcal{E}(f, f\varphi) - \mathcal{E}(f^2/2, \varphi).$$

Then it is possible to show that the energy measure coincides with $|Df|_w^2\, m$. Also, the distance
$d$ coincides with the intrinsic distance $d_{\mathcal{E}}$ induced by the form, defined by

$$d_{\mathcal{E}}(x, y) := \sup\big\{ |g(x) - g(y)| \;:\; g \in D(\mathcal{E}) \cap C(X),\ [g] \le m \big\}.$$

Taking advantage of these identifications and of the locality of $\mathcal{E}$ (which is a consequence of
the locality of the notion $|Df|_w$), one can also see that on $RCD(K, \infty)$ spaces a
Brownian motion with continuous sample paths associated to $\mathcal{E}$ exists and is unique.
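
The quadratic structure behind the form $\mathcal{E}$ defined above from Ch can be made explicit with a short computation. This is a sketch that uses only the parallelogram identity for Ch and its 2-homogeneity. The last equality additionally assumes the normalization $\mathrm{Ch}(f) = \frac12 \int_X |Df|_w^2\, dm$.

```latex
% Polarization: combine the definition of \mathcal{E} with the parallelogram identity
% Ch(f+g) + Ch(f-g) = 2Ch(f) + 2Ch(g).
\begin{align*}
\mathcal{E}(f,g) &= \mathrm{Ch}(f+g) - \mathrm{Ch}(f) - \mathrm{Ch}(g)
                  = \tfrac12\big(\mathrm{Ch}(f+g) - \mathrm{Ch}(f-g)\big), \\
\mathcal{E}(f,f) &= \mathrm{Ch}(2f) - 2\,\mathrm{Ch}(f) = 2\,\mathrm{Ch}(f)
                  = \int_X |Df|_w^2 \, dm .
\end{align*}
```

The second line is consistent with the identification of the energy measure with $|Df|_w^2\, m$: its total mass is $\mathcal{E}(f, f)$.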

Finally, for $RCD(K, \infty)$ spaces it is possible to prove tensorization and globalization
properties which are in line with those available for $CD(K, \infty)$ spaces.